Issue Number | 4105 |
---|---|
Summary | Summaries QC reports fail on unpublished/able documents |
Created | 2016-05-20 11:54:14 |
Issue Type | Bug |
Submitted By | Osei-Poku, William (NIH/NCI) [C] |
Assigned To | Englisch, Volker (NIH/NCI) [C] |
Status | Closed |
Resolved | 2016-05-20 15:19:42 |
Resolution | Fixed |
Path | /home/bkline/backups/jira/ocecdr/issue.184546 |
The summaries QC reports fail to run if the summary links to documents, such as glossary terms and modules, that have not been made publishable or have not been published. This appears to be a recent development.
Here are two examples:
774255
<Errors> <Err>XSLT error: code: 61 msgtype:error code:61 module:Sablotron URI:cdr:CDR0000622462/last line:2 msg:could not open document 'cdr:CDR0000780537/lastp' </Err> </Errors>
62843
<Errors> <Err>XSLT error: code: 61 msgtype:error code:61 module:Sablotron URI:cdrutil:/get-pv-num/CDR0000779396#_1909 line:1 node:attribute 'encoding' msg:could not open document 'cdr:CDR0000779396/lastp' </Err> </Errors>
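For triage, the ID of the document that could not be opened can be pulled out of messages like the two above. This is a minimal Python sketch that assumes the message format shown here; the function name `unopened_doc` is hypothetical and not part of the CDR code base.

```python
import re

# Matches the tail of a Sablotron "could not open document" message, e.g.
#   msg:could not open document 'cdr:CDR0000780537/lastp'
# The ID part is optional so that a URI without a document ID still matches.
_PATTERN = re.compile(r"could not open document 'cdr:(CDR\d+)?/(\w+)'")


def unopened_doc(message):
    """Return (doc_id, version_spec) for the failed lookup, or None.

    doc_id is None when the URI carries no document ID.
    """
    match = _PATTERN.search(message)
    if match is None:
        return None
    return match.group(1), match.group(2)


err = ("XSLT error: code: 61 msgtype:error code:61 module:Sablotron "
       "URI:cdr:CDR0000622462/last line:2 "
       "msg:could not open document 'cdr:CDR0000780537/lastp'")
print(unopened_doc(err))  # ('CDR0000780537', 'lastp')
```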
Have the documents been made publishable now? I ran the QC report on both summaries on PROD and the report ran fine without any error.
The links have been removed from those summaries. You can try this one 425416
This appears to be a recent development.
You are correct. This is a result of the ticket to change the system so that Cancer.gov is able to display HP glossary definitions. That task required us to denormalize glossaries, which we didn't have to do in the past. The denormalization filter expects a publishable version to exist. As discussed at our status meeting when we talked about miscellaneous documents, we always want to ensure that a publishable version is picked up.
If you need to run the report despite the fact that a glossary term is included that hasn't been made publishable yet, please use the Quick-and-Dirty 911-option on the QC reports menu.
We probably should discuss this in the CDR meeting. I didn't know that we were giving up the ability to run the RLSO and BU reports because of that issue.
No, you're not giving up the ability to run RLSO and BU reports but you may need to add another report option in order to run the reports when there are unpublished glossaries included.
Yes, we should discuss this on Thursday.
One of the errors (62843) came from an unpublishable Module (CDR0000779396). Is that for the same reason as it is for the glossary terms?
Yes, you wouldn't want an unpublishable module to be included in a publishable summary, so the module denormalization filter expects a publishable version to exist. That's pretty much the same for all linked documents.
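As a rough illustration of why the report fails, here is a Python sketch of resolving a `lastp` version specifier. It assumes `lastp` selects the last publishable version and `last` the last version of any kind, which is my reading of the error messages in this ticket. The in-memory `VERSIONS` table, the document IDs in it, and both function names are hypothetical stand-ins, not the CDR server's API.

```python
# Hypothetical version store: doc ID -> list of (version number, publishable?)
VERSIONS = {
    "CDR0000000001": [(1, True), (2, False)],  # has a publishable version
    "CDR0000000002": [(1, False)],             # versioned, never publishable
}


def resolve(doc_id, spec):
    """Return the version number selected by spec ('last' or 'lastp').

    Raises LookupError when no matching version exists, which mirrors the
    'could not open document' failure seen in the report.
    """
    versions = VERSIONS.get(doc_id, [])
    if spec == "lastp":
        versions = [v for v in versions if v[1]]
    elif spec != "last":
        raise ValueError(f"unknown version spec: {spec}")
    if not versions:
        raise LookupError(f"could not open document 'cdr:{doc_id}/{spec}'")
    return max(num for num, _ in versions)


def resolve_or_placeholder(doc_id, spec, placeholder=None):
    """Lenient variant: return a placeholder instead of failing."""
    try:
        return resolve(doc_id, spec)
    except LookupError:
        return placeholder


print(resolve("CDR0000000001", "lastp"))                 # 1
print(resolve_or_placeholder("CDR0000000002", "lastp"))  # None
```

The strict `resolve` corresponds to what the denormalization filter does today; a lenient variant is one possible shape for a report option that tolerates unpublishable links.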
It is not always the case that the summary would be publishable. For example, while working on a new summary you want to be able to run the QC reports successfully. For the summary above, the report runs successfully when the link to the module is in approved insertion markup. However, when you change it to proposed insertion markup, the report produces the error reported above. So, it seems to me that the problem has more to do with the revision level markup than with the publishability of the document.
You are now adding another dimension that we hadn't talked about by referring to the insertion/deletion markup. We shouldn't mix the two.
The revision level markup is applied before denormalization takes place. That means that, depending on the markup used for the insertion/deletion elements and on the revision level selected in the QC report options, the link may or may not get deleted. If the link does not get deleted, the report fails.
Did you check if the module (or GlossaryTerm) was being displayed when the report did not fail? It should be stripped out, and that's why the report isn't failing; you'll see different results for different combinations of the revision markup.
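To make the ordering concrete, here is a small Python sketch of the two stages. Everything in it is illustrative: the XML is simplified, the `Link` element and the document ID are hypothetical, and the rule that an `Insertion` whose `RevisionLevel` is not selected gets dropped together with its content is my assumption about how the markup filter behaves.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical summary fragment with a link inside proposed markup.
DOC = """<Summary>
  <Para>Kept text <Insertion RevisionLevel="proposed">with a
  <Link ref="CDR0000000002"/></Insertion></Para>
</Summary>"""


def apply_revision_markup(root, levels):
    """Stage 1: drop Insertion elements whose level is not selected."""
    for parent in list(root.iter()):
        for child in list(parent):
            if (child.tag == "Insertion"
                    and child.get("RevisionLevel") not in levels):
                parent.remove(child)
    return root


def links_to_denormalize(root):
    """Stage 2 input: the links that survive stage 1 and must be resolved."""
    return [link.get("ref") for link in root.iter("Link")]


for levels in ({"approved"}, {"approved", "proposed"}):
    root = apply_revision_markup(ET.fromstring(DOC), levels)
    print(sorted(levels), links_to_denormalize(root))
```

With only `approved` selected, the link is stripped and never reaches denormalization, so nothing can fail. With `proposed` included, the link survives and has to be resolved, which is where a missing publishable version would surface.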
Did you check if the module (or GlossaryTerm) was being displayed when the report did not fail?
It was the module that was displayed in the error message (see the original post and below as well). In this particular summary, there was no unpublishable glossary term. I know they appear to be two different issues, but they happened at about the same time, so the superstitious part of me tells me that they are related :-).
62843
<Errors> <Err>XSLT error: code: 61 msgtype:error code:61
module:Sablotron URI:cdrutil:/get-pv-num/CDR0000779396#_1909 line:1
node:attribute 'encoding' msg:could not open document
'cdr:CDR0000779396/lastp' </Err> </Errors>
No, you're not giving up the ability to run RLSO and BU reports but you may need to add another report option in order to run the reports when there are unpublished glossaries included.
Yes, we should discuss this on Thursday
If it is a matter of adding an option, then I can create a ticket for it to be added. Seems a bit late now as it would have been better to have had it released with Darwin.
The option already exists. Look at the bottom of the QC report options page where you have the check box for Run Quick & Dirty report. This prevents the report from failing if a linked-to document doesn't exist yet.
Correct me if I am wrong but the Quick & Dirty report may ignore other restrictions/errors that we may want to QC. It is not ignoring only the unpublishable documents, right?
My understanding of the Quick and Dirty option/report is that it ignores errors and allows the report to run even when there are empty elements including metadata elements. I don't think running the report on the Quick and Dirty option in this case is going to be a helpful option as it is not only ignoring unpublishable documents but also other potential errors that could impact the publishability of the summary document. So, an option to only ignore the publishability of glossary terms is a better option for us.
I agree with William. While the Quick & Dirty report does work, we'll need to discuss a more suitable solution moving forward. The Q&D was meant to be an internal report for use in emergencies but we commonly include links to unpublishable documents (especially glossary terms) in the summaries, so an option to bypass the check for glossary term publishability only while still displaying the links would be beneficial. Perhaps we could even bypass this in the default report settings so we wouldn't have to generate the QC report twice once we realize it has invalid links.
Correct me if I am wrong but the Quick & Dirty report may ignore other restrictions/errors that we may want to QC. It is not ignoring only the unpublishable documents, right?
The idea for the QD report was to allow a report to run if the regular QC report would fail due to this type of denormalization error. To be honest, it's been many years since we created the report, and I would have to go back through our Bugzilla tickets to see if we included other exceptions, but my feeling is that the denormalization problem was the only thing the report was getting around.
Since we created the QD QC reports we seem to have neglected them a bit and failed to keep them in sync with the regular QC reports. Maybe that's why you're thinking that the report excludes other things as well. I see, for instance, that the table and figure numbering isn't included in the filter sets, and there are probably other changes to the QC filters that didn't make it to the QD QC filters.
OCECDR-3300 is the issue for creating the Q&D Report and it looks like the list of elements included in the exception is below:
TranslationOf
LOESummary /lastp
Diagnosis /lastp
SummaryRef /lastp
SummaryFragmentRef /lastp
MiscellaneousDocLink /lastp
MediaID /last
ProtocolRef
CitationLink /lastp
GlossaryRef
LOERef
Seems like the list of exceptions is too long for this option to be really reliable.
It's likely this can be completed outside of a release (filter change only).
Would you be able to create a new glossary term on DEV and link it to a (short) summary for testing?
New Term: CDR0000778700 (Zocor Test)
CDR0000062755 (English -HP Screening summary)
CDR0000258032 (English -Patient Screening summary)
CDR0000747922 (Spanish - HP Screening summary)
CDR0000750632 (Spanish -Patient Screening summary)
Please let me know if you need more documents for testing.
Zocor Test is a publishable document. I was thinking about a test term that causes the QC report to fail.
Sorry. I created a new term CDR0000778701 without making it publishable and added it to all the above summaries with the exception of CDR0000062755, which is checked out to you. Now when I run the report, I get the following error message:
An error has occurred
<Errors> <Err>XSLT error: code: 61 msgtype:error code:61 module:Sablotron URI:cdrx:CDR0000778701/lastp line:1 msg:could not open document 'cdr:/last' </Err> </Errors>
I think I'm done with this change. Please take a look on DEV. I've successfully included a simple string without any associated document (not publishable, not versioned, not even unversioned).
The following filter has been updated:
Denormalization Filter: Summary GlossaryTerm (CDR780696)
It seems to be working fine now. Thanks!
We will do additional testing and let you know if there are any issues
or not.
Verified on DEV. Thanks!
The filter has been updated on QA:
R14172: CDR780696.xml (Denormalization Filter: Summary GlossaryTerm)
This appears to be working well on QA. It needs additional testing. I will update this ticket when we have completed testing on QA.
I posted the comment under the wrong ticket. I just moved it to the right ticket.
We have completed testing this change on QA. Everything looks good. Thank you!
The filter has been updated on STAGE:
R14172 (trunk): CDR780696.xml (Denormalization Filter: Summary
GlossaryTerm)
STAGE Filter ID: CDR777552
Verified on STAGE.
The filter has been updated on PROD:
R14172 (trunk): CDR780696.xml (Denormalization Filter: Summary GlossaryTerm)
I'm still running into this error on PROD (at least I think it's the same error). For document 62863 (Genetics of CRC):
<Errors> <Err>XSLT error: code: 61 msgtype:error code:61 module:Sablotron URI:cdrutil:/get-pv-num/ line:1 node:attribute 'encoding' msg:could not open document 'cdr:/lastp' </Err> </Errors>
Would you have a guess which non-published GlossaryTerm is causing the error? My guess is that this is not related to a GlossaryTerm unless I forgot to copy one of the filters to PROD.
The problem was a missing SummaryFragmentRef, something this ticket did not address.
I added the error reported above to OCECDR-4126