Issue Number | 4954 |
---|---|
Summary | Global change to remove AltTitle elements that have Navlabel values |
Created | 2021-03-08 15:27:29 |
Issue Type | Improvement |
Submitted By | Osei-Poku, William (NIH/NCI) [C] |
Assigned To | Englisch, Volker (NIH/NCI) [C] |
Status | Closed |
Resolved | 2021-03-09 19:08:55 |
Resolution | Fixed |
Path | /home/bkline/backups/jira/ocecdr/issue.286481 |
Please run a global change to remove all AltTitle elements (including data) that have been marked with the Navlabel value for the TitleType attribute.
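For illustration, the core of the requested transformation could look roughly like the sketch below. It is not the actual global change script: the function name, the sample document, and the `Summary`, `SummaryTitle`, and `Short` names are assumptions made for the example. Only `AltTitle`, `TitleType`, and `Navlabel` come from this request.

```python
import xml.etree.ElementTree as ET
from typing import Tuple


def remove_navlabel_alt_titles(xml_text: str) -> Tuple[str, int]:
    """Return the document without Navlabel AltTitle elements, plus the count removed."""
    root = ET.fromstring(xml_text)
    removed = 0
    # ElementTree keeps no parent pointers, so walk every potential parent
    # and drop its matching children. Copy the lists before mutating them.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag == "AltTitle" and child.get("TitleType") == "Navlabel":
                # remove() drops the element, its content, and its tail text.
                parent.remove(child)
                removed += 1
    return ET.tostring(root, encoding="unicode"), removed


# Hypothetical sample document; real summaries are far larger.
sample = (
    "<Summary>"
    "<SummaryTitle>Breast Cancer Treatment</SummaryTitle>"
    '<AltTitle TitleType="Short">Breast Cancer</AltTitle>'
    '<AltTitle TitleType="Navlabel">Breast</AltTitle>'
    "</Summary>"
)
cleaned, count = remove_navlabel_alt_titles(sample)
print(count)
print(cleaned)
```

AltTitle elements with any other TitleType value are left untouched, which matches the scope of the request.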
I created the global change script Summary_AltTitle.py and started the first test run on DEV.
The results of the test run are available on DEV
Here are the stats for the global change run:
2021-03-09 20:36:01.844 [INFO] Run completed.
{{ Docs examined = 556}}
{{ Docs changed = 0}}
{{ Versions changed = 1576}}
{{ Could not lock = 0}}
{{ Errors = 0}}
{{ Time = 1:36:22.311558}}
I have started looking at the results but the display of the elements for the Diff makes it difficult to review. There are two long lines of text with all the elements on the same line. I have to scroll all the way to the right in order to see the changes. If this could be formatted so that I don't have to scroll to the right, that would be great.
I understand the issue but implementing an XML diff tool for this global change is probably out of scope. That is not a simple tool to build.
Also, if you sort the report by document diff size, you will see that only the first 10-15% of the documents show the behavior you're describing. For the majority of the diffs, the output fits on a single page without horizontal scrolling (although that depends somewhat on monitor size and resolution).
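Short of a real XML diff tool, one cheap approximation is to pretty-print both versions before running an ordinary line diff, so that each element lands on its own line. The sketch below only illustrates that idea and is not part of the global change tooling. The function names and the sample documents are assumptions, and it relies only on the standard library (`xml.dom.minidom` and `difflib`).

```python
import difflib
from xml.dom import minidom


def pretty_lines(xml_text):
    """Re-serialize the XML with one element per line, indented by depth."""
    return minidom.parseString(xml_text).toprettyxml(indent="  ").splitlines()


def readable_diff(old_xml, new_xml):
    """Unified diff of the pretty-printed forms of two XML documents."""
    return "\n".join(
        difflib.unified_diff(
            pretty_lines(old_xml),
            pretty_lines(new_xml),
            fromfile="before",
            tofile="after",
            lineterm="",
        )
    )


# Hypothetical before/after documents, each serialized on a single long line.
before = (
    "<Summary><SummaryTitle>Breast Cancer Treatment</SummaryTitle>"
    '<AltTitle TitleType="Navlabel">Breast</AltTitle></Summary>'
)
after = "<Summary><SummaryTitle>Breast Cancer Treatment</SummaryTitle></Summary>"

diff_text = readable_diff(before, after)
print(diff_text)
```

This does not handle mixed content or significant whitespace carefully, so it is a review aid at best and no substitute for comparing the stored documents.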
Test results look good on DEV. Please run in live mode on DEV.
2021-03-18 20:20:22.300 [INFO] Run completed.
{{ Docs examined = 552}}
{{ Docs changed = 549}}
{{ Versions changed = 1161}}
{{ Could not lock = 3}}
{{ Errors = 0}}
{{ Time = 2:35:33.052141}}
Specific versions saved:
{{ new cwd = 41}}
{{ new pub = 461}}
{{ new ver = 151}}
{{ old cwd = 123}}
The live mode on DEV finished (see the job summary in the comment above).
I will attach the log file in case you would like to see the blocked documents and those with warnings. I was surprised to see that many validation warnings (since the data came from PROD) but I guess that's OK.
Verified on DEV. Thanks! Please run in test mode on QA.
The errors appear to stem from the one glossary term with a rejected status - CDR0000302456 (There may be additional terms). (OCECDR-4950 is what is taking care of this issue during publishing). The global is likely to invalidate a lot of summaries especially on PROD. We would certainly prefer to fix this before we run the global in live mode on PROD.
Why would the global invalidate documents? The Navlabel version of the AltTitle is not mandatory and we're not making any changes to the schema.
Do you have an example of a document that became invalid?
This is one example from the logs which made me think that some of the documents would be invalidated, and there are several of them. Also, I did confirm from the live run that the CWD is invalid after the global.
2021-03-18 20:10:43.993 [WARNING] CDR0000800326: Failed link target rule: /GlossaryTermName/TermNameStatus != "Rejected"
2021-03-18 20:10:43.993 [WARNING] CDR0000800326: Non-publishable version will be created.
2021-03-18 20:10:43.996 [WARNING] CDR0000800326: b'Failed link target rule: /GlossaryTermName/TermNameStatus != "Rejected"
I see what you mean. I thought you were referring to documents becoming invalid because of the AltTitle being removed. As I mentioned, this change won't have any effect on whether a document is valid.
If a linked document isn't valid or no longer exists, then yes, saving the change will create an invalid version, which will need to be corrected before an updated version of the document can be published. Until then, the existing last publishable version remains available.
2021-03-24 15:01:58.636 [INFO] Run completed.
{{ Docs examined = 554}}
{{ Docs changed = 0}}
{{ Versions changed = 1573}}
{{ Could not lock = 0}}
{{ Errors = 0}}
{{ Time = 1:36:00.753793}}
The test results for the Global Change are available on QA.
Here is the log file. GlobalChange_QA.txt
We may need to talk about this a bit more to find a solution, since several documents fall into this category. The problem is that the warning reported in the logs is not really something we need to fix in the CDR. I think it points to the fact that the glossary term has a definition that is rejected. There are no plans to fix it in XMetal. That is, the definition will remain rejected in the CDR unless we want to "fix it" before running the global in live mode and then "unfix it" afterward, to prevent invalidating the summary documents on PROD. We would like to avoid having several invalid documents on PROD after the global.
We could extract the CDR-IDs for documents with warnings from the log file and exclude these from the Global Change. That would result in 86 documents being excluded from the Global Change. That's about 10% of summaries. However, not all of the warnings are the result of a non-publishable link target. Someone would need to decide whether to exclude only documents with a specific warning or all documents with warnings.
Excluding the affected documents from the global run should be fine but that will depend on how many of the warnings are the result of the non-publishable link target. Could you please provide a list of the documents that have the different types of warning?
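A list of documents per warning type could be pulled from the log with something like the sketch below. It assumes every warning line follows the `[WARNING] CDR<10 digits>: <message>` layout seen in the excerpt quoted earlier in this ticket. The function name is hypothetical, and the sample lines are copied from this ticket rather than read from the attached log file.

```python
import re
from collections import defaultdict

# Assumed layout: "<timestamp> [WARNING] CDR0000800326: <message>"
WARNING_LINE = re.compile(r"\[WARNING\] (CDR\d{10}): (.+)$")


def warnings_by_type(log_lines):
    """Map each distinct warning message to the sorted CDR IDs reporting it."""
    groups = defaultdict(set)
    for line in log_lines:
        match = WARNING_LINE.search(line.rstrip())
        if match:
            cdr_id, message = match.groups()
            groups[message].add(cdr_id)
    return {message: sorted(ids) for message, ids in groups.items()}


# Sample lines taken from this ticket; a real run would iterate over the log file.
sample_log = [
    '2021-03-18 20:10:43.993 [WARNING] CDR0000800326: Failed link target rule: '
    '/GlossaryTermName/TermNameStatus != "Rejected"',
    "2021-03-18 20:10:43.993 [WARNING] CDR0000800326: Non-publishable version will be created.",
    "2021-03-18 20:20:22.300 [INFO] Run completed.",
]

report = warnings_by_type(sample_log)
for message, ids in report.items():
    print(len(ids), message, ids)
```

Grouping on the exact message text keeps the link target warnings separate from other validation warnings, which is the distinction needed to decide what to exclude.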
Please exclude blocked summaries as we won't fix any errors in those documents.
Are you asking to exclude blocked summaries from being processed by the global change or to process blocked summaries and exclude them from being reported because of the warnings?
Looking at the log file I see 9 documents that are not blocked with validation warnings.
Please run in live mode on QA.
The live run on QA completed.
I identified the following documents that aren't blocked but have warnings:
62890, 256677, 256685, 587224, 772163, 784073, 792723, 797908, 802226
I'm attaching the log file AltTitle_QA_live.log.
{{ Docs examined = 554}}
{{ Docs changed = 554}}
{{ Versions changed = 1169}}
{{ Could not lock = 0}}
{{ Errors = 0}}
{{ Time = 2:46:19.686110}}
Specific versions saved:
{{ new cwd = 39}}
{{ new pub = 465}}
{{ new ver = 150}}
{{ old cwd = 128}}
Looks good on QA. Please run in test mode on PROD.
{{ Docs examined = 553}}
{{ Docs changed = 0}}
{{ Versions changed = 1570}}
{{ Could not lock = 0}}
{{ Errors = 0}}
{{ Time = 1:21:51.319073}}
The diff files for the test run on PROD are now available on DEV.
I'm attaching the log file for the run. AltTitle_PROD_test.log
I see the following documents that are not blocked with warnings:
256685, 587224, 772163, 784073, 792723, 797908, 802226
Looks good from test results. Please proceed to run in live mode on PROD. Thanks!
It looks like the live run was completed last Thursday. I can see the changes on PROD.
I forgot to include the statistics for the live run on PROD. Here they are:
2021-04-08 22:34:22.941 [INFO] Run completed.
{{ Docs examined = 554}}
{{ Docs changed = 553}}
{{ Versions changed = 1165}}
{{ Could not lock = 1}}
{{ Errors = 0}}
{{ Time = 2:20:39.029035}}
Specific versions saved:
{{ new cwd = 40}}
{{ new pub = 464}}
{{ new ver = 148}}
{{ old cwd = 123}}
File Name | Posted | User |
---|---|---|
AltTitle_PROD_test.log | 2021-04-07 14:03:12 | Englisch, Volker (NIH/NCI) [C] |
AltTitle_QA_live.log | 2021-03-31 15:56:47 | Englisch, Volker (NIH/NCI) [C] |
GlobalChange_QA.txt | 2021-03-24 16:36:18 | Englisch, Volker (NIH/NCI) [C] |
GlobalChangeLog.txt | 2021-03-19 11:52:21 | Englisch, Volker (NIH/NCI) [C] |