CDR Tickets

Issue Number 3408
Summary [Summaries] Global to add Govt Employee Element in All Board Member Docs
Created 2011-08-24 09:32:19
Issue Type Improvement
Submitted By Juthe, Robin (NIH/NCI) [E]
Assigned To Englisch, Volker (NIH/NCI) [C]
Status Closed
Resolved 2011-10-03 13:51:32
Resolution Fixed
Path /home/bkline/backups/jira/ocecdr/issue.107736
Description

BZISSUE::5101
BZDATETIME::2011-08-24 09:32:19
BZCREATOR::Robin Juthe
BZASSIGNEE::Volker Englisch
BZQACONTACT::William Osei-Poku

As discussed in last week's CDR meeting, we would like to add the required Government Employee element to all Board member documents that don't already have it and populate it with the value "Unknown" for any Board member documents that do not already contain the element.

The related OCECDR-3295 added this element and populated it for all current Board members.
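
A minimal sketch of the transformation requested here, for illustration only. It assumes the element is named GovernmentEmployee, that it is a direct child of the document root, and that appending it as the last child is acceptable; the real schema may require a different name or position. The sample document and its BoardMemberName child are made up, and the standard library's ElementTree stands in for whatever XSLT or lxml code the global change actually uses.

```python
import xml.etree.ElementTree as ET

# Assumed names; the real schema's element name and position may differ.
ELEMENT_NAME = "GovernmentEmployee"
DEFAULT_VALUE = "Unknown"


def add_govt_employee(doc_xml):
    """Return (xml, changed). Documents that already have the element are left alone."""
    root = ET.fromstring(doc_xml)
    if root.find(ELEMENT_NAME) is not None:
        return doc_xml, False
    child = ET.SubElement(root, ELEMENT_NAME)  # appended as the last child (assumption)
    child.text = DEFAULT_VALUE
    return ET.tostring(root, encoding="unicode"), True


# Hypothetical sample document, not taken from the CDR.
sample = "<PDQBoardMemberInfo><BoardMemberName>Example</BoardMemberName></PDQBoardMemberInfo>"
updated, changed = add_govt_employee(sample)
print(changed)
print(updated)
```

Running the function a second time on its own output reports no change, which is the behavior wanted for documents that were already populated by OCECDR-3295.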

Comment entered 2011-08-24 11:55:59 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2011-08-24 11:55:59
BZCOMMENTOR::Volker Englisch
BZCOMMENT::1

It might not be a bad idea to add our Globals Expert as a CC so that he can jump in before I try something stupid.

Comment entered 2011-08-24 14:55:00 by alan

BZDATETIME::2011-08-24 14:55:00
BZCOMMENTOR::Alan Meyer
BZCOMMENT::2

I suggest that we do a walkthrough tomorrow of the basic tools that exist for doing global changes. The tools handle figuring out what versions to modify and how and where to save them in test or live mode, and keeping all the statistics, so you can concentrate on the actual transformation (we've been using either XSLT or lxml) and the database selection.
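
The division of labor described above could be pictured roughly as follows. This is a hypothetical harness, not the ModifyDocs interface: the class name, its fields, and the in-memory dictionary standing in for the database are all assumptions. It only shows the idea that the caller supplies the selection and the transformation while the harness handles test versus live mode and the statistics.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable


@dataclass
class GlobalChangeRun:
    """Hypothetical harness; not the actual ModifyDocs API."""
    select_ids: Callable[[], Iterable[int]]   # which documents to examine
    transform: Callable[[str], str]           # the actual change to the XML
    load: Callable[[int], str]                # fetch the current XML
    save: Callable[[int, str], None]          # store the modified XML
    test_mode: bool = True
    stats: Dict[str, int] = field(
        default_factory=lambda: {"examined": 0, "changed": 0, "saved": 0})
    test_output: Dict[int, str] = field(default_factory=dict)

    def run(self):
        for doc_id in self.select_ids():
            self.stats["examined"] += 1
            old = self.load(doc_id)
            new = self.transform(old)
            if new == old:
                continue
            self.stats["changed"] += 1
            if self.test_mode:
                self.test_output[doc_id] = new   # kept for review, nothing stored
            else:
                self.save(doc_id, new)
                self.stats["saved"] += 1
        return self.stats


def add_flag(xml):
    """Toy transformation: add a <Flag/> child unless one is present."""
    return xml if "<Flag/>" in xml else xml.replace("</Doc>", "<Flag/></Doc>")


# In-memory stand-in for the document store.
docs = {101: "<Doc><Name/></Doc>", 102: "<Doc><Name/><Flag/></Doc>"}
run = GlobalChangeRun(select_ids=lambda: sorted(docs), transform=add_flag,
                      load=docs.__getitem__, save=docs.__setitem__,
                      test_mode=True)
stats = run.run()
print(stats)
```

In test mode the modified XML is only collected for review; switching test_mode to False is the only change needed for the same job to write its results back.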

Comment entered 2011-08-24 15:31:18 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2011-08-24 15:31:18
BZCOMMENTOR::Volker Englisch
BZCOMMENT::3

Adding dependency.

Comment entered 2011-08-31 14:15:10 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2011-08-31 14:15:10
BZCOMMENTOR::Volker Englisch
BZCOMMENT::4

A first test run of the global change has been completed on MAHLER and is ready for review:

http://mahler.nci.nih.gov/cgi-bin/cdr/ShowGlobalChangeTestResults.py?Session=guest

Comment entered 2011-09-06 11:57:46 by Juthe, Robin (NIH/NCI) [E]

BZDATETIME::2011-09-06 11:57:46
BZCOMMENTOR::Robin Juthe
BZCOMMENT::5

(In reply to comment #4)
> A first test run for the global change has been implemented on MAHLER.
> This is ready for review on MAHLER.
> http://mahler.nci.nih.gov/cgi-bin/cdr/ShowGlobalChangeTestResults.py?Session=guest

Verified test run. Please run in live mode on Mahler.

Comment entered 2011-09-07 17:02:46 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2011-09-07 17:02:46
BZCOMMENTOR::Volker Englisch
BZCOMMENT::6

I ran the live run on MAHLER.

Most of the documents are now valid, but some were still invalid after the global change, e.g. CDR669318, CDR468943, CDR689795.

Comment entered 2011-09-07 18:12:10 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2011-09-07 18:12:10
BZCOMMENTOR::Volker Englisch
BZCOMMENT::7

Alan said he usually attaches the log-file to the Bugzilla issue which is probably a good idea.

Log file attached.

Comment entered 2011-09-07 18:12:10 by Englisch, Volker (NIH/NCI) [C]

Attachment ModifyDocs.log has been added with description: Log file for Live run on MAHLER

Comment entered 2011-09-22 11:09:29 by Juthe, Robin (NIH/NCI) [E]

BZDATETIME::2011-09-22 11:09:29
BZCOMMENTOR::Robin Juthe
BZCOMMENT::8

(In reply to comment #7)
> Created attachment 2153 [details]
> Log file for Live run on MAHLER
> Alan said he usually attaches the log-file to the Bugzilla issue which is
> probably a good idea.
> Log file attached.

This doesn't appear to be the right log file for this issue.

Comment entered 2011-09-22 11:50:16 by Juthe, Robin (NIH/NCI) [E]

BZDATETIME::2011-09-22 11:50:16
BZCOMMENTOR::Robin Juthe
BZCOMMENT::9

(In reply to comment #6)
> There were some documents that were still invalid after the global change but
> most of the documents are now valid, i.e. CDR669318, CDR468943, CDR689795.

Each of these appear to be schema validation errors. I corrected the error in CDR468943. The other two documents have an error related to the Specific Board Member Contact block. I tried moving it in CDR669318, but the doc is still invalid. I can't seem to figure it out. Could we have a complete list of docs with validation errors? (that will probably be in the log file) Then I might be able to see the pattern. Thanks.

Comment entered 2011-09-23 13:02:51 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2011-09-23 13:02:51
BZCOMMENTOR::Volker Englisch
BZCOMMENT::10

I would have to agree with Robin. Although the log file attached earlier was very informative in itself, it didn't have the same impact when compared to the current issue. So this time I'm attaching the log file for the documents that were actually modified.

The content of this log file is a little different than usual, for the following reason. The general rule for the globals is not to validate a document after the update if it was already invalid before the update. The general thinking is that we only worry about validation for documents that started out valid.
In this case, however, the global change will actually make a formerly invalid document valid. This is reflected in the log file by extra rows like
2011-09-06 16:08:27: Processing CDR0000369818 [/last:4/cwd:4]
2011-09-06 16:08:29: DocValid (before): No
2011-09-06 16:08:30: DocValid (after): Yes
2011-09-06 16:08:31: DocValid (before): No
2011-09-06 16:08:31: DocValid (after): Yes
2011-09-06 16:08:31: saveDoc(369818, ver='Y' pub='N' val='N' old cwd)
2011-09-06 16:08:31: saveDoc(369818, ver='Y' pub='N' val='Y' new ver)

The first row displays the document ID and the versions. Then the versions that need to be updated and resaved (cwd, last version, publishable version) are validated, updated, re-validated, and saved.
You expect to see the scenario
DocValid (before): No
DocValid (after): Yes
for each individual version. Obviously, this is not the case if a document was invalid not just because of the missing Government Employee element but also because of additional problems.
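
For anyone reviewing a log by hand, the DocValid rows described above can be paired up per document with a few lines of Python. This is a sketch that assumes the log keeps exactly the line layout shown in this comment; the function name and regular expressions are my own and are not part of the global change script.

```python
import re
from collections import defaultdict

PROCESSING = re.compile(r"Processing (CDR\d+)")
DOC_VALID = re.compile(r"DocValid \((before|after)\): (Yes|No)")


def validity_transitions(log_lines):
    """Map each document ID to a list of (before, after) pairs, one per version."""
    result = defaultdict(list)
    current = None
    before = None
    for line in log_lines:
        match = PROCESSING.search(line)
        if match:
            current, before = match.group(1), None
            continue
        match = DOC_VALID.search(line)
        if match and current is not None:
            phase, value = match.groups()
            if phase == "before":
                before = value
            elif before is not None:
                result[current].append((before, value))
                before = None
    return dict(result)


log = """\
2011-09-06 16:08:27: Processing CDR0000369818 [/last:4/cwd:4]
2011-09-06 16:08:29: DocValid (before): No
2011-09-06 16:08:30: DocValid (after): Yes
2011-09-06 16:08:31: DocValid (before): No
2011-09-06 16:08:31: DocValid (after): Yes
2011-09-06 16:08:31: saveDoc(369818, ver='Y' pub='N' val='N' old cwd)
2011-09-06 16:08:31: saveDoc(369818, ver='Y' pub='N' val='Y' new ver)
""".splitlines()

transitions = validity_transitions(log)
for doc_id, pairs in transitions.items():
    print(doc_id, pairs)
```

Any document whose pairs include ("No", "No") is one that has problems beyond the missing element.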

Warning for CDR0000624378: Invalid value: ... in element CurrentMember
Warning for CDR0000624378: Invalid value: ... in element CurrentMember
Warning for CDR0000669318: No match found in content model for type PDQBoardMemberInfo
Warning for CDR0000669318: Invalid date value: '2010-01-27' in element InvitationDate
Warning for CDR0000468943: No match found in content model for type BoardMembershipDetails
Warning for CDR0000369774: Invalid date value: '2009-4-13' in element DateSigned
Warning for CDR0000689795: No match found in content model for type SpecificBoardMemberContact
Warning for CDR0000369832: Invalid date value: '2009-3-25' in element DateSigned
Warning for CDR0000369835: Invalid value: ... in element TerminationReason
Warning for CDR0000369773: Invalid date value: '2009-3-30' in element DateSigned
Warning for CDR0000369815: No match found in content model for type BoardMembershipDetails
Warning for CDR0000369815: Invalid date value: ... in element InvitationDate
Warning for CDR0000369815: Unable to find type for element TermEndReason
Warning for CDR0000680250: Expected child elements for empty BoardMemberAssistant element of type BoardMemberAssistant
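
To get the complete list of documents with validation errors that Robin asked for, warnings in this format can be grouped by document ID. The sketch below assumes every warning line starts with "Warning for CDR..." as in the rows above; the helper name is hypothetical, and the sample uses three of the warning lines copied from this comment.

```python
import re
from collections import defaultdict

WARNING = re.compile(r"^Warning for (CDR\d+): (.*)$")


def warnings_by_doc(lines):
    """Group warning messages by document ID, keeping first-seen order."""
    grouped = defaultdict(list)
    for line in lines:
        match = WARNING.match(line.strip())
        if match:
            grouped[match.group(1)].append(match.group(2))
    return dict(grouped)


sample = [
    "Warning for CDR0000669318: No match found in content model for type PDQBoardMemberInfo",
    "Warning for CDR0000669318: Invalid date value: '2010-01-27' in element InvitationDate",
    "Warning for CDR0000468943: No match found in content model for type BoardMembershipDetails",
]
grouped = warnings_by_doc(sample)
for doc_id, messages in grouped.items():
    print(doc_id, len(messages))
```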

Sometimes you will also see the information
Doc CDR0000369867 already has GovtEmpl element - skipping
which just means that somebody (most likely myself or a test run of the global) had already fixed the document manually at some point.

I will look into these warning messages and try making the documents valid on MAHLER.

Comment entered 2011-09-23 13:02:51 by Englisch, Volker (NIH/NCI) [C]

Attachment Bug5101m.log has been added with description: Log file for Live run on MAHLER

Comment entered 2011-09-23 13:20:23 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2011-09-23 13:20:23
BZCOMMENTOR::Volker Englisch
BZCOMMENT::11

I've checked on FRANCK: of the documents that failed on MAHLER, all but one were valid after adding the Government Employee element.
That one document
Warning for CDR0000369835: Invalid value: ... in element TerminationReason
listed an incorrect TerminationReason.

I suggest running the global on FRANCK in test/live mode next.

Comment entered 2011-09-23 16:55:40 by alan

BZDATETIME::2011-09-23 16:55:40
BZCOMMENTOR::Alan Meyer
BZCOMMENT::12

(In reply to comment #10)
> ...
> In this case, however, the global change will actually make a formally invalid
> document valid. This is reflected in the log file by the extra rows like
> 2011-09-06 16:08:27: Processing CDR0000369818 [/last:4/cwd:4]
> 2011-09-06 16:08:29: DocValid (before): No
> 2011-09-06 16:08:30: DocValid (after): Yes
> ...

That's a useful capability. We ought to add it, at least as an option, to the ModifyDocs module to make it available to all global changes.

Comment entered 2011-09-27 16:41:30 by Juthe, Robin (NIH/NCI) [E]

BZDATETIME::2011-09-27 16:41:30
BZCOMMENTOR::Robin Juthe
BZCOMMENT::13

(In reply to comment #11)
> I've checked on FRANCK and for those documents that failed on MAHLER all but
> one document was valid after adding the Govm't Employee element.
> That one document
> Warning for CDR0000369835: Invalid value: ... in element TerminationReason
> listed an incorrect TerminationReason.
> I suggest to run the global on FRANCK in test/live mode next.

I agree. Let's run in live mode on FRANCK.

Comment entered 2011-09-27 18:23:24 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2011-09-27 18:23:24
BZCOMMENTOR::Volker Englisch
BZCOMMENT::14

I ran the global change on FRANCK in live mode.

There were again the two warnings we already saw on MAHLER:
Warning for CDR0000369835: Invalid value: 'Asked to leave/did not return emails' in element TerminationReason

Warning for CDR0000680250: Expected child elements for empty BoardMemberAssistant element of type BoardMemberAssistant

I did think that it was odd that there were fewer documents examined on FRANCK (138) than on MAHLER (192) and I will have to look at that tomorrow.

Most of the documents are now valid due to the change; those that are still invalid are mostly missing data.

Please have a look at the attached log file.

Comment entered 2011-09-27 18:23:24 by Englisch, Volker (NIH/NCI) [C]

Attachment Request5101_f.log has been added with description: Log file for Live run on FRANCK

Comment entered 2011-09-29 11:10:30 by Juthe, Robin (NIH/NCI) [E]

BZDATETIME::2011-09-29 11:10:30
BZCOMMENTOR::Robin Juthe
BZCOMMENT::15

(In reply to comment #14)
> Created attachment 2162 [details]
> Log file for Live run on FRANCK
> I ran the global change on FRANCK in live mode.
> There were again the two warnings we already saw on MAHLER:
> Warning for CDR0000369835: Invalid value: 'Asked to leave/did not return
> emails' in element TerminationReason
> Warning for CDR0000680250: Expected child elements for empty
> BoardMemberAssistant element of type BoardMemberAssistant
> I did think that it was odd that there were fewer documents examined on FRANCK
> (138) than on MAHLER (192) and I will have to look at that tomorrow.
> Most of the documents are now valid due to the change and for those that are
> not that's mostly due to missing data.
> Please have a look at the attached log file.

Verified on Franck. The discrepancy is probably due to Franck being a fresher set of data.

Since the couple of validation errors are related to the data in the records and not to the government employee element, let's proceed with a test run on Bach. If the same errors are identified (and they likely will be), we can correct the documents manually on Bach.

Comment entered 2011-09-29 11:33:30 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2011-09-29 11:33:30
BZCOMMENTOR::Volker Englisch
BZCOMMENT::16

I ran the global on BACH in test mode. There were no errors, but a few documents were still invalid after the change.
Attached is the log file and here is the link for the test results:
http://bach.nci.nih.gov/cgi-bin/cdr/ShowGlobalChangeTestResults.py?Session=guest

Comment entered 2011-09-29 11:33:30 by Englisch, Volker (NIH/NCI) [C]

Attachment Request5101_b_test.log has been added with description: Log file for Test run on BACH

Comment entered 2011-09-29 16:24:24 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2011-09-29 16:24:24
BZCOMMENTOR::Volker Englisch
BZCOMMENT::17

I ran the live run on BACH and there were four error messages. These look like the same ones we had already seen on the other two systems.

The log file is attached.

Comment entered 2011-09-29 16:24:24 by Englisch, Volker (NIH/NCI) [C]

Attachment Request5101_b_live.log has been added with description: Log file for Live run on BACH

Comment entered 2011-10-03 13:51:16 by Juthe, Robin (NIH/NCI) [E]

BZDATETIME::2011-10-03 13:51:16
BZCOMMENTOR::Robin Juthe
BZCOMMENT::18

Verified on Bach.

Comment entered 2011-10-03 13:51:32 by Juthe, Robin (NIH/NCI) [E]

BZDATETIME::2011-10-03 13:51:32
BZCOMMENTOR::Robin Juthe
BZCOMMENT::19

Closing issue. Thanks!

Attachments
File Name               Posted               User
Bug5101m.log            2011-09-23 13:02:51  Englisch, Volker (NIH/NCI) [C]
ModifyDocs.log          2011-09-07 18:12:10  Englisch, Volker (NIH/NCI) [C]
Request5101_b_live.log  2011-09-29 16:24:24  Englisch, Volker (NIH/NCI) [C]
Request5101_b_test.log  2011-09-29 11:33:30  Englisch, Volker (NIH/NCI) [C]
Request5101_f.log       2011-09-27 18:23:24  Englisch, Volker (NIH/NCI) [C]
