Issue Number | 4464 |
---|---|
Summary | [Publishing] Ability to Display Filter Messages |
Created | 2018-04-24 18:44:38 |
Issue Type | Improvement |
Submitted By | Englisch, Volker (NIH/NCI) [C] |
Assigned To | Kline, Bob (NIH/NCI) [C] |
Status | Closed |
Resolved | 2018-07-05 09:20:02 |
Resolution | Fixed |
Path | /home/bkline/backups/jira/ocecdr/issue.224841 |
Before Gauss we were collecting errors and warnings encountered during publishing and saved those messages to the pub_proc_doc table. In addition we listed the number of errors encountered during a publishing job in the job output. The errors and warnings could be displayed with the PubStatus report directly from the job message displayed. Currently, we’ll have to pull up the Publishing Job Summary report.
I noticed that we haven’t been recording any errors or warnings since
we put our Gauss release in place (before Feb. 23).
We used to see messages from the filters like these:
<Messages>
<message>
<LI class="warning">
MediaLink[language='es']<BR/>
Spanish MediaLink element exists but Spanish Term Definition
missing.</LI>
</message>
</Messages>
In addition, the messages file also listed information in case a push job had been reverified on the Gatekeeper server in order to correct possible GK processing errors.
It would be useful to capture this information and have a way to get to it again.
If you would like to have this addressed in the next release, go ahead and add it to Ising.
I'm considering a couple of approaches. One would be to log the warnings using the standard logging mechanism, possibly in a log file separate from export-docs.log. The other would be to store the warnings in a serialized form using Python's eval(...) to preserve lists of error strings. It is unfortunate that some of the error and warning strings emitted by the filters appear to be wrapped by those filters in HTML markup. How pervasive is that practice? In general, we should defer presentation decisions as late in the process as possible.
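As a rough illustration of the first approach above, a dedicated log file for filter messages could be set up with Python's standard logging module along these lines. The logger name, file name, and the sample error text are assumptions made for this sketch, not taken from the publishing code.

```python
import logging
import os
import tempfile


def make_warning_logger(path):
    """Return a logger which writes filter messages to their own file."""
    logger = logging.getLogger("filter-messages")  # hypothetical logger name
    logger.setLevel(logging.WARNING)
    logger.propagate = False  # keep these messages out of the main export log
    handler = logging.FileHandler(path, encoding="utf-8")
    handler.setFormatter(
        logging.Formatter("%(asctime)s [%(levelname)s] %(message)s")
    )
    logger.addHandler(handler)
    return logger


# Hypothetical location; a real job would use the CDR log directory.
log_path = os.path.join(tempfile.mkdtemp(), "filter-messages.log")
logger = make_warning_logger(log_path)

# The message strings are logged as plain text, without presentation markup.
logger.warning(
    "CDR%d: %s",
    350260,
    "Publishable Version of linked Document does not exist.",
)
logger.error("CDR%d: %s", 62902, "sample error message")  # made-up text

for handler in logger.handlers:
    handler.flush()
```

The drawback raised later in the thread still applies: messages kept only in a log file are not reachable from the links in the notification email.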
One would be to log the warnings using the standard logging mechanism, possibly in a log file separate from export-docs.log.
In this case we should modify the job notification email "Status and Error Report for Nightly Publishing". This provided a link allowing us to see any errors or warnings with a single click. The output for
https://cdr.cancer.gov/cgi-bin/cdr/PubStatus.py?id=16550&type=FilterFailure&flavor=error or
https://cdr.cancer.gov/cgi-bin/cdr/PubStatus.py?id=16550&type=FilterFailure&flavor=warning
is currently useless. It would also require us to copy/paste the errors/warnings found and forward them to William. William had access to the links/error report on the email notification.
It is unfortunate that some of the error and warning strings emitted by the filters appear to be wrapped by those filters in HTML markup.
We needed a way to distinguish between warnings and errors and this was what worked under the given conditions. The code was written about 10-12 years ago and did what it needed to do. I won't argue that code written 10 years ago couldn't be improved today. :-)
How pervasive is that practice?
I'm aware of 2 or 3 places of which one had been eliminated with the rewrite of Gauss.
I have modified export-docs.py to store any warnings and errors as a serialized (repr(...)) sequence of strings, and the modified version is installed on DEV. I ran a nightly publishing job to make sure I haven't broken anything that was working when there are no errors and warnings present, but ~volker has agreed to do the work of making publishing jobs (with ~oseipokuw's help) which cause documents to fail or emit warning messages. Once they've satisfied themselves that this part is working correctly, I will make the necessary modifications to the publishing job status report script.
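The round trip described above might look roughly like the following sketch. The function names are hypothetical, and the reading side uses ast.literal_eval rather than a bare eval(...), a substitution made here because it only accepts Python literals; the ticket does not say which one the production scripts use.

```python
import ast


def serialize_messages(messages):
    """Store a sequence of message strings as a single text value."""
    return repr([str(m) for m in messages])


def deserialize_messages(stored):
    """Recover the list of strings from the stored text value."""
    if not stored:
        return []
    value = ast.literal_eval(stored)  # accepts literals only, unlike eval()
    if isinstance(value, str):
        return [value]
    return [str(m) for m in value]


messages = [
    "GlossaryTermRef (CDR782476), Publishable Version of linked "
    "Document does not exist.",
    "Spanish MediaLink element exists but Spanish Term Definition missing.",
]
stored = serialize_messages(messages)
restored = deserialize_messages(stored)
```

Because the stored value is a plain list of strings, the decision about how to render each message (for example, as HTML list items) can be left to the report script.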
I've been trying all afternoon to recreate the error messages we used
to get as part of the publishing output but without success. Maybe ~oseipokuw can tell me what I
was doing wrong but I'm certain the following scenario should have
resulted in a warning message:
I've added the GlossaryTerm PRES (CDR793495) for which no
publishable version exists to display as part of the summary CDR350260.
After running a hot-fix job the filter should have created a warning
message similar to this:
GlossaryTermRef (CDR782476), Publishable Version of linked Document
does not exist.
However, there is no entry in the pub_proc_doc table for
document 350260 (pub_proc = 16568).
I may be doing something wrong but at the moment I don't think any messages are being created.
You weren't doing anything wrong; there was a bug in export_docs.py, which was failing to commit the UPDATE query that puts the messages in the pub_proc_doc rows. See
SELECT *
FROM pub_proc_doc
WHERE pub_proc = 16569
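For readers unfamiliar with this class of bug, the sketch below reproduces it with Python's built-in sqlite3 module standing in for the real database. The table layout (apart from the pub_proc_doc and pub_proc names), the column names doc_id and messages, and the helper functions are assumptions for illustration only.

```python
import os
import sqlite3
import tempfile

db_path = os.path.join(tempfile.mkdtemp(), "demo.db")  # throwaway database

# Set up one row for the job, with no messages recorded yet.
conn = sqlite3.connect(db_path)
conn.execute(
    "CREATE TABLE pub_proc_doc (pub_proc INTEGER, doc_id INTEGER, messages TEXT)"
)
conn.execute("INSERT INTO pub_proc_doc VALUES (16569, 350260, NULL)")
conn.commit()
conn.close()


def store_messages(doc_id, messages, commit):
    """Run the UPDATE; the change only survives if it is committed."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "UPDATE pub_proc_doc SET messages = ? WHERE doc_id = ?",
        (messages, doc_id),
    )
    if commit:
        conn.commit()
    conn.close()  # closing without a commit discards the pending UPDATE


def read_messages(doc_id):
    """Fetch the stored messages over a fresh connection."""
    conn = sqlite3.connect(db_path)
    rows = conn.execute(
        "SELECT messages FROM pub_proc_doc WHERE doc_id = ?", (doc_id,)
    ).fetchall()
    conn.close()
    return rows[0][0]


stored = repr(["Publishable Version of linked Document does not exist."])
store_messages(350260, stored, commit=False)
before_fix = read_messages(350260)  # still NULL: the UPDATE was lost
store_messages(350260, stored, commit=True)
after_fix = read_messages(350260)   # now holds the serialized list
```

The UPDATE itself runs without error in both cases, which is why the job appeared to succeed while nothing was being recorded.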
Now I need to find out why the subdir column has the HTML for a non-breaking space (which JIRA won't let me put directly into this comment) in it for hotfix jobs.
Should I wait or can I continue with testing?
Please continue. The subdir problem is just something I noticed while tracking down the commit bug, and doesn't have any effect on this ticket.
Turned out to be a bug in the CdrQueries.py CGI script, not the publishing system. Fixed. Back to your regularly scheduled programming. :-)
I've been able to create a couple of warning messages but I wasn't able to produce error messages. This is in part because we don't publish CTGovProtocols anymore, and these were the documents that most often contained DTD errors, and in part because we've improved our validation.
Maybe ~oseipokuw will have more success producing error messages during our QA testing.
Why don't you make a temporary modification to the DTD, calculated to make validation fail, and then revert the DTD after your tests?
I've been able to create errors and warnings now and the messages are stored in the pub_proc_doc table.
I did notice that the process to validate a document using the
Filter Document interface will display warnings and errors while
the publishing process only lists the error when a document contains
both.
I don't know if this is expected. My test document was CDR62902.
Yes, that's expected. In fact we discussed this, in the context of realizing that it would let us look at the failure column to determine whether the messages were for errors or warnings.
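A minimal sketch of that idea follows, assuming the failure column holds 'Y' for failed documents and NULL otherwise, and that messages are stored as a repr(...) list of strings. The row layout, the function names, the third document ID, and the sample error text are made up for illustration.

```python
import ast


def flavor(failure):
    """Map the failure flag to the report flavor ('error' or 'warning')."""
    return "error" if failure == "Y" else "warning"


def group_messages(rows):
    """Group (doc_id, message) pairs by flavor.

    Each row is (doc_id, failure, stored_messages); this layout is an
    assumption, not the actual pub_proc_doc column order.
    """
    grouped = {"error": [], "warning": []}
    for doc_id, failure, stored in rows:
        if not stored:
            continue  # document published cleanly, nothing to report
        for message in ast.literal_eval(stored):
            grouped[flavor(failure)].append((doc_id, message))
    return grouped


rows = [
    (350260, None, repr(["GlossaryTermRef (CDR782476), Publishable "
                         "Version of linked Document does not exist."])),
    (62902, "Y", repr(["sample validation error"])),  # made-up message
    (12345, None, None),                              # made-up document ID
]
grouped = group_messages(rows)
```

Since a failed document records only its errors, one flag per row is enough to label every message in that row.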
The PubStatus.py script has been modified to display warnings/errors stored as a serialized list. Will this be sufficient (~volker and ~oseipokuw), or do we need to have a path which handles older publishing jobs?
In my opinion we don't need to do anything more.
The filter messages are displayed on QA.
The filter messages are displaying again on PROD.
Closing ticket.