Issue Number | 3436 |
---|---|
Summary | Publishing too many Closed Protocols |
Created | 2011-10-21 12:47:58 |
Issue Type | Improvement |
Submitted By | Englisch, Volker (NIH/NCI) [C] |
Assigned To | Englisch, Volker (NIH/NCI) [C] |
Status | Closed |
Resolved | 2011-11-17 14:16:01 |
Resolution | Fixed |
Path | /home/bkline/backups/jira/ocecdr/issue.107764 |
BZISSUE::5130
BZDATETIME::2011-10-21 12:47:58
BZCREATOR::Volker Englisch
BZASSIGNEE::Volker Englisch
BZQACONTACT::Alan Meyer
I found a small bug in the system, but I am not yet sure how to solve it.
Margaret had me look at our publishing schedule for the individual document types, and I noticed that some of the closed InScopeProtocols have been published over and over again as part of our nightly publishing without having been modified.
Our rule for publishing closed InScopeProtocols overnight is to
include those documents that have a new version which does not yet
exist on Cancer.gov.
It appears we have a problem with new versions of protocols that do
not trigger a push to Cancer.gov because they contain administrative
changes only.
When this happens and a new version is created, the publishing job
looks in the table 'pushed_doc' to find the latest pushed version of the
document. If a newer version of the document exists, the protocol is
selected for publishing. Once the document has been published, we diff
the old and new versions of the document to decide whether the new
version needs to be pushed. If the XML hasn't changed, the new version
is not pushed, so the pushed_doc table (actually a view) never gets
updated to the new version.
This catch-22 currently results in about 350 protocol documents being published every single night without ever being pushed to Cancer.gov.
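The loop described above can be sketched in a few lines of Python. This is only an illustration of the logic as described in this ticket, not the actual publishing code: the `Doc` class, the `nightly_job` function, and the version numbers are all hypothetical; only the document ID comes from the example in this issue.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    """Hypothetical stand-in for one closed InScopeProtocol."""
    doc_id: int
    latest_version: int   # newest version of the document
    pushed_version: int   # last version recorded in pushed_doc
    xml_changed: bool     # False when the new version is administrative only


def nightly_job(doc: Doc) -> bool:
    """Run one night of the old selection logic; True if the doc was published."""
    if doc.latest_version <= doc.pushed_version:
        return False  # nothing newer than the pushed version, so skip
    # The document is published, then old and new XML are compared.
    if doc.xml_changed:
        doc.pushed_version = doc.latest_version  # a push updates pushed_doc
    # Without a push, pushed_version stays behind and the document
    # is selected again on the next run.
    return True


# Version numbers are made up for illustration.
doc = Doc(doc_id=63440, latest_version=5, pushed_version=4, xml_changed=False)
nights_published = sum(nightly_job(doc) for _ in range(5))
print(nights_published)  # 5: published every night, never pushed
```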
One way to "resolve" this issue could be to re-publish those documents and force them to be updated on Cancer.gov, thereby updating the last version listed in pushed_doc, but this would not take care of the problem for future cases.
As an example, please look at CDR63440. This document has been published every single publishing night since July 2nd, 2010 (329 unnecessary updates since that day).
Long story short - we should try to fix this.
BZDATETIME::2011-10-26 13:42:30
BZCOMMENTOR::Volker Englisch
BZCOMMENT::1
The problem is that the pushed_doc table is too restrictive. In that
view/SQL statement we shouldn't be checking whether a document has been
pushed to Cancer.gov, but whether a new version of the document exists
that hasn't been published yet.
If a new version does exist but has not been published, it should be
picked up for publishing (this part currently works correctly). But if
a new version exists and it has already been published once before, it
shouldn't be published again, at least not as part of the nightly
publishing job where we're ignoring changes in linked documents.
I believe we can fix this problem by looking at the view published_doc
instead of pushed_doc in our selection criteria. I successfully tested
this change on MAHLER but wanted to see if anyone can think of reasons
why this wouldn't work correctly.
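To make the difference between the two selection criteria concrete, here is a minimal Python sketch. It assumes each view can be reduced to a mapping from document ID to the last recorded version; the function name, the document IDs, and the version numbers are hypothetical, and the real selection is done in SQL against the pushed_doc and published_doc views.

```python
def select_for_publishing(latest, last_seen):
    """Return the IDs of documents whose latest version is newer than
    the version recorded in the view being consulted.

    latest:    {doc_id: newest version number}
    last_seen: {doc_id: version recorded in the view}
    """
    return sorted(
        doc_id
        for doc_id, version in latest.items()
        if version > last_seen.get(doc_id, 0)
    )


# Hypothetical data for three closed protocols.
latest = {101: 3, 102: 7, 103: 2}
# 102 has an administrative-only version 7 that was published but never pushed.
# 103 has a new version 2 that has not been published yet.
pushed_doc = {101: 3, 102: 6, 103: 1}
published_doc = {101: 3, 102: 7, 103: 1}

print(select_for_publishing(latest, pushed_doc))     # [102, 103]
print(select_for_publishing(latest, published_doc))  # [103]
```

With pushed_doc, document 102 would be selected every night; with published_doc, only the genuinely unpublished version of 103 is picked up.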
BZDATETIME::2011-11-10 10:24:49
BZCOMMENTOR::Volker Englisch
BZCOMMENT::2
The publishing document has been updated on FRANCK and BACH.
178.xml (document version 56)
I will monitor the result of this change after tonight's publishing job.
BZDATETIME::2011-11-11 10:38:13
BZCOMMENTOR::Volker Englisch
BZCOMMENT::3
We were publishing on average 406 CTGovProtocols and 344 closed
protocols every night. After installing the new publishing document on
BACH, we published 117 CTGovProtocols and 12 closed protocols last
night.
It appears that around 600 documents had been published unnecessarily each night in the past.
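As a quick check of that estimate, the counts reported in this comment can be subtracted directly. The dictionary keys are just labels chosen for this sketch, and the result compares a nightly average against a single night, so it is only a rough figure.

```python
# Counts taken from the comment above.
before = {"CTGovProtocols": 406, "Closed protocols": 344}  # nightly average
after = {"CTGovProtocols": 117, "Closed protocols": 12}    # first night after the fix

saved = {name: before[name] - after[name] for name in before}
total_saved = sum(saved.values())

print(saved)        # {'CTGovProtocols': 289, 'Closed protocols': 332}
print(total_saved)  # 621
```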
I believe we can close this issue now.
BZDATETIME::2011-11-17 14:16:01
BZCOMMENTOR::Volker Englisch
BZCOMMENT::4
No problems with production identified so far.
Closing issue.