CDR Tickets

Issue Number 4464
Summary [Publishing] Ability to Display Filter Messages
Created 2018-04-24 18:44:38
Issue Type Improvement
Submitted By Englisch, Volker (NIH/NCI) [C]
Assigned To Kline, Bob (NIH/NCI) [C]
Status Closed
Resolved 2018-07-05 09:20:02
Resolution Fixed
Path /home/bkline/backups/jira/ocecdr/issue.224841
Description

Before Gauss we collected errors and warnings encountered during publishing and saved those messages to the pub_proc_doc table. In addition, we listed the number of errors encountered during a publishing job in the job output. The errors and warnings could be displayed with the PubStatus report directly from the job message displayed. Currently, we have to pull up the Publishing Job Summary report.

I noticed that we haven’t been recording any errors or warnings since we put our Gauss release in place (before Feb. 23).
We used to see messages from the filters like these:
<Messages>
<message>
<LI class="warning">
MediaLink[language='es']<BR/>
Spanish MediaLink element exists but Spanish Term Definition missing.</LI>
</message>
</Messages>

In addition, the messages file also listed information in case a push job had been reverified on the Gatekeeper server in order to correct possible GK processing errors.

It would be useful to capture this information and have a way to get to it again.

Comment entered 2018-05-18 16:56:25 by Kline, Bob (NIH/NCI) [C]

If you would like to have this addressed in the next release, go ahead and add it to Ising.

Comment entered 2018-06-15 13:22:41 by Kline, Bob (NIH/NCI) [C]

I'm considering a couple of approaches. One would be to log the warnings using the standard logging mechanism, possibly in a log file separate from export-docs.log. The other would be to store the warnings in a serialized form using Python's repr(...) to preserve lists of error strings. It is unfortunate that some of the error and warning strings emitted by the filters appear to be wrapped by those filters in HTML markup. How pervasive is that practice? In general, we should defer presentation decisions as late in the process as possible.
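
A minimal sketch of the first approach (a dedicated log file for filter messages), using only Python's standard logging module. The logger name, the function names, and the message format are assumptions made for illustration; they are not taken from the publishing code.

```python
import logging


def make_warning_logger(log_path, name="publishing.filter-messages"):
    """Return a logger which writes only to its own file (hypothetical helper)."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.WARNING)
    # Keep these records out of any handlers attached to the root logger.
    logger.propagate = False
    if not logger.handlers:
        handler = logging.FileHandler(log_path, encoding="utf-8")
        handler.setFormatter(
            logging.Formatter("%(asctime)s [%(levelname)s] %(message)s")
        )
        logger.addHandler(handler)
    return logger


def log_filter_messages(logger, doc_id, warnings=(), errors=()):
    """Write one log record per warning or error string for a document."""
    for warning in warnings:
        logger.warning("CDR%d: %s", doc_id, warning)
    for error in errors:
        logger.error("CDR%d: %s", doc_id, error)
```

With this shape the messages stay plain strings in the log, and any markup is left to whatever later displays them.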

Comment entered 2018-06-15 14:02:13 by Englisch, Volker (NIH/NCI) [C]

One would be to log the warnings using the standard logging mechanism, possibly in a log file separate from export-docs.log.

In this case we should modify the job notification email "Status and Error Report for Nightly Publishing". This provided a link allowing us to see any errors or warnings with a single click. The output for

is currently useless. It would also require us to copy/paste the errors/warnings found and forward to William. William had access to the links/error report on the email notification.

Comment entered 2018-06-15 17:47:30 by Englisch, Volker (NIH/NCI) [C]

It is unfortunate that some of the error and warning strings emitted by the filters appear to be wrapped by those filters in HTML markup.

We needed a way to distinguish between warnings and errors and this was what worked under the given conditions. The code was written about 10-12 years ago and did what it needed to do. I won't argue that code written 10 years ago couldn't be improved today. :-)
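
For illustration, the severity carried in the markup can be separated from the message text at display time. The sketch below parses the sample message quoted in the description with the standard library's ElementTree; the function name and the returned shape are assumptions, and it relies on the filter output being well-formed XML, as the sample is.

```python
import xml.etree.ElementTree as ET

# Sample filter output copied from the ticket description.
SAMPLE = """<Messages>
<message>
<LI class="warning">
MediaLink[language='es']<BR/>
Spanish MediaLink element exists but Spanish Term Definition missing.</LI>
</message>
</Messages>"""


def extract_messages(xml_text):
    """Return a list of (severity, [text lines]) pairs (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    results = []
    for item in root.iter("LI"):
        # The class attribute is what distinguishes warnings from errors.
        severity = item.get("class", "unknown")
        # itertext() yields the text before and after the <BR/> element.
        lines = [part.strip() for part in item.itertext() if part.strip()]
        results.append((severity, lines))
    return results
```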

How pervasive is that practice?

I'm aware of 2 or 3 places of which one had been eliminated with the rewrite of Gauss.

Comment entered 2018-06-20 15:25:02 by Kline, Bob (NIH/NCI) [C]

I have modified export-docs.py to store any warnings and errors as a serialized (repr(...)) sequence of strings, and the modified version is installed on DEV. I ran a nightly publishing job to make sure I haven't broken anything that was working when there are no errors and warnings present, but has agreed to do the work of making publishing jobs (with 's help) which cause documents to fail or emit warning messages. Once they've satisfied themselves that this part is working correctly, I will make the necessary modifications to the publishing job status report script.
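
A minimal sketch of the serialization described above: a sequence of strings is stored as its repr(...) and read back with ast.literal_eval, which only accepts Python literals and so avoids running eval on stored data. The function names are hypothetical, and the actual code in export-docs.py may differ.

```python
import ast


def serialize_messages(messages):
    """Serialize a sequence of message strings for storage in a text column."""
    return repr([str(message) for message in messages])


def deserialize_messages(stored):
    """Recover the list of strings; an empty or NULL value means no messages."""
    if not stored:
        return []
    value = ast.literal_eval(stored)
    if not isinstance(value, (list, tuple)):
        raise ValueError("expected a serialized sequence of strings")
    return [str(message) for message in value]
```

The round trip preserves embedded quotes and non-ASCII characters, because repr(...) escapes them and literal_eval reverses the escaping.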

https://github.com/NCIOCPL/cdr-publishing/commit/75a720b

Comment entered 2018-06-26 19:20:08 by Englisch, Volker (NIH/NCI) [C]

I've been trying all afternoon to recreate the error messages we used to get as part of the publishing output but without success. Maybe can tell me what I was doing wrong but I'm certain the following scenario should have resulted in a warning message:
I've added the GlossaryTerm PRES (CDR793495) for which no publishable version exists to display as part of the summary CDR350260. After running a hot-fix job the filter should have created a warning message similar to this:
GlossaryTermRef (CDR782476), Publishable Version of linked Document does not exist.
However, there is no entry in the pub_proc_doc table for document 350260 (pub_proc = 16568).

I may be doing something wrong but at the moment I don't think any messages are being created.

Comment entered 2018-06-27 10:34:12 by Kline, Bob (NIH/NCI) [C]

You weren't doing anything wrong. There was a bug in export-docs.py, which was failing to commit the UPDATE query that stores the messages in the pub_proc_doc rows. See

SELECT *
  FROM pub_proc_doc
 WHERE pub_proc = 16569
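
The effect of a missing commit can be reproduced in miniature. The sketch below uses an SQLite file as a stand-in for the CDR database (an assumption, chosen so the example is self-contained); the doc_id and messages column names and all function names are likewise assumed for illustration. An UPDATE which is never committed is rolled back when the connection closes, so the row keeps its old value.

```python
import os
import sqlite3
import tempfile


def make_demo_db():
    """Create a throwaway database with a single pub_proc_doc row."""
    fd, path = tempfile.mkstemp(suffix=".db")
    os.close(fd)
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE pub_proc_doc"
        " (pub_proc INTEGER, doc_id INTEGER, messages TEXT)"
    )
    conn.execute("INSERT INTO pub_proc_doc VALUES (16569, 350260, NULL)")
    conn.commit()
    conn.close()
    return path


def store_messages(path, pub_proc, doc_id, messages, commit=True):
    """Store the serialized messages; skipping the commit loses the update."""
    conn = sqlite3.connect(path)
    try:
        conn.execute(
            "UPDATE pub_proc_doc SET messages = ?"
            " WHERE pub_proc = ? AND doc_id = ?",
            (repr(messages), pub_proc, doc_id),
        )
        if commit:
            conn.commit()
    finally:
        conn.close()


def fetch_messages(path, pub_proc, doc_id):
    """Read the stored value back through a fresh connection."""
    conn = sqlite3.connect(path)
    try:
        rows = conn.execute(
            "SELECT messages FROM pub_proc_doc"
            " WHERE pub_proc = ? AND doc_id = ?",
            (pub_proc, doc_id),
        ).fetchall()
    finally:
        conn.close()
    return rows[0][0] if rows else None
```

Calling store_messages(..., commit=False) and then fetch_messages(...) returns None, which matches the symptom reported for job 16568: the job ran, but no messages appeared in the table.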

Now I need to find out why the subdir column has the HTML for a non-breaking space (which JIRA won't let me put directly into this comment) in it for hotfix jobs.

Comment entered 2018-06-27 10:50:27 by Englisch, Volker (NIH/NCI) [C]

Should I wait or can I continue with testing?

Comment entered 2018-06-27 10:58:49 by Kline, Bob (NIH/NCI) [C]

Please continue. The subdir problem is just something I noticed while tracking down the commit bug, and doesn't have any effect on this ticket.

Comment entered 2018-06-27 11:24:17 by Kline, Bob (NIH/NCI) [C]

Turned out to be a bug in the CdrQueries.py CGI script, not the publishing system. Fixed. Back to your regularly scheduled programming. :-)

Comment entered 2018-06-27 13:05:28 by Englisch, Volker (NIH/NCI) [C]

I've been able to create a couple of warning messages but I wasn't able to produce error messages. This is in part because we don't publish CTGovProtocols anymore and these were documents most often including DTD errors and in part because we've improved our validation.

Maybe will have more success producing error messages during our QA testing.

Comment entered 2018-06-27 13:11:37 by Kline, Bob (NIH/NCI) [C]

Why don't you make a temporary modification to the DTD, calculated to make validation fail, and then revert the DTD after your tests?

Comment entered 2018-06-29 12:12:00 by Englisch, Volker (NIH/NCI) [C]

I've been able to create errors and warnings now and the messages are stored in the pub_proc_doc table.

I did notice that the process to validate a document using the Filter Document interface will display warnings and errors while the publishing process only lists the error when a document contains both.
I don't know if this is expected. My test document was CDR62902.

Comment entered 2018-06-29 12:20:36 by Kline, Bob (NIH/NCI) [C]

Yes, that's expected. In fact we discussed this, in the context of realizing that it would let us look at the failure column to determine whether the messages were for errors or warnings.
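
A sketch of how a report could use the failure column to label the stored messages, since a row holds either errors or warnings but not both. It assumes the failure column holds 'Y' for a failed document and that the messages are stored as a repr(...) list of strings; the function name is hypothetical.

```python
import ast


def label_messages(failure, stored):
    """Return (label, message) pairs for one pub_proc_doc row."""
    # Assumption: failure == "Y" marks a failed document; anything else
    # (including NULL) means the stored messages are warnings.
    label = "error" if failure == "Y" else "warning"
    messages = ast.literal_eval(stored) if stored else []
    return [(label, str(message)) for message in messages]
```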

Comment entered 2018-07-05 09:09:17 by Kline, Bob (NIH/NCI) [C]

The PubStatus.py script has been modified to display warnings/errors stored as a serialized list. Will this be sufficient ( and ), or do we need to have a path which handles older publishing jobs?

Comment entered 2018-07-06 09:40:12 by Englisch, Volker (NIH/NCI) [C]

In my opinion we don't need to do anything more.

Comment entered 2018-07-13 16:56:20 by Englisch, Volker (NIH/NCI) [C]

The filter messages are displayed on QA.

Comment entered 2018-08-09 14:35:39 by Englisch, Volker (NIH/NCI) [C]

The filter messages are displaying again on PROD.
Closing ticket.
