CDR Tickets

Issue Number 3436
Summary Publishing too many Closed Protocols
Created 2011-10-21 12:47:58
Issue Type Improvement
Submitted By Englisch, Volker (NIH/NCI) [C]
Assigned To Englisch, Volker (NIH/NCI) [C]
Status Closed
Resolved 2011-11-17 14:16:01
Resolution Fixed
Path /home/bkline/backups/jira/ocecdr/issue.107764
Description

BZISSUE::5130
BZDATETIME::2011-10-21 12:47:58
BZCREATOR::Volker Englisch
BZASSIGNEE::Volker Englisch
BZQACONTACT::Alan Meyer

I found a little bug in the system but I am not sure yet how to solve it.

Margaret had me look at our publishing schedule for the individual document types, and while doing so I noticed that some of the closed InScopeProtocols have been published over and over again as part of our nightly publishing without having been modified.

Our rule for publishing closed InScopeProtocols overnight is to include those documents that have a new version which does not yet exist on Cancer.gov.
It appears we have a problem with new versions of protocols that do not trigger a push to Cancer.gov because they contain administrative changes only.
When such a new version is created, the publishing job looks in the table 'pushed_doc' to find the latest pushed version of the document. If a newer version of the document exists, the protocol is selected for publishing. Once the document has been published, we diff the old and new output to decide whether the new version needs to be pushed. If the XML hasn't changed, the new version is never pushed, so the pushed_doc table (actually a view) is never updated to the new version and the document is selected again the next night.
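
To make the loop described above concrete, here is a minimal sketch in Python. It is not the CDR publishing code: the Doc class, its fields, and nightly_job() are hypothetical names invented for illustration, and the version numbers are made up.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    """Hypothetical stand-in for one protocol document."""
    doc_id: int
    latest_version: int   # newest version available for publishing
    pushed_version: int   # last version recorded in pushed_doc
    xml_changed: bool     # True if the new version changes the exported XML


def nightly_job(doc):
    """Simulate one nightly run; return (published, pushed)."""
    # Selection rule: a version newer than the last pushed one exists.
    if doc.latest_version <= doc.pushed_version:
        return (False, False)
    # The document is published, then diffed against the copy on Cancer.gov.
    if doc.xml_changed:
        # Only a real XML change leads to a push, which updates pushed_doc.
        doc.pushed_version = doc.latest_version
        return (True, True)
    # Administrative change only: no push, so pushed_doc stays behind
    # and the same document is selected again tomorrow.
    return (True, False)


# A document whose newest version contains administrative changes only.
admin_only = Doc(doc_id=1, latest_version=2, pushed_version=1, xml_changed=False)
admin_runs = [nightly_job(admin_only) for _ in range(5)]

# A document whose newest version really changes the XML.
real_change = Doc(doc_id=2, latest_version=2, pushed_version=1, xml_changed=True)
real_runs = [nightly_job(real_change) for _ in range(5)]
```

In this sketch admin_runs is (True, False) on every night, which is the repeated publish-without-push described above, while real_runs shows a single publish and push followed by nights where the document is not selected.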

As a result of this catch-22, about 350 protocol documents are currently published every single night without ever being pushed to Cancer.gov.

One way to "resolve" this issue would be to re-publish those documents and force them to be updated on Cancer.gov, thereby updating the last version listed in pushed_doc, but this would not take care of the problem for future cases.

As an example, please look at CDR63440. This document has been published every single publishing night since July 2nd, 2010 (329 unnecessary updates since that day).

Long story short - we should try to fix this.

Comment entered 2011-10-26 13:42:30 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2011-10-26 13:42:30
BZCOMMENTOR::Volker Englisch
BZCOMMENT::1

The problem is that the pushed_doc view is too restrictive. In that view/SQL statement we shouldn't be checking whether a document has been pushed to Cancer.gov, but whether a new version of the document exists that hasn't been published yet.
If a new version exists and has not been published, it should be picked up for publishing (this part currently works correctly). But if a new version exists and has already been published once before, it shouldn't be published again, at least not as part of the nightly publishing job, where we ignore changes in linked documents.
I believe we can fix this problem by using the view published_doc instead of pushed_doc in our selection criteria. I successfully tested this change on MAHLER but wanted to see if anyone can think of reasons why this wouldn't work correctly.
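
The difference between the two selection criteria can be illustrated with a small, self-contained sketch using an in-memory SQLite database. The schema is an assumption made for this example only: the real pushed_doc and published_doc views, the doc_version table, and the actual selection query in the publishing document have different and richer definitions. The function name select_for_publishing() is likewise hypothetical.

```python
import sqlite3

# Hypothetical, heavily simplified schema (plain tables instead of views).
SCHEMA = """
CREATE TABLE doc_version   (doc_id INTEGER, num INTEGER);
CREATE TABLE pushed_doc    (doc_id INTEGER, doc_version INTEGER);
CREATE TABLE published_doc (doc_id INTEGER, doc_version INTEGER);
"""


def build_sample_db():
    """Create three sample documents covering the cases in this ticket."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    # Doc 1: version 2 is administrative only (published, never pushed).
    # Doc 2: version 2 has not been published yet.
    # Doc 3: no new version since the last push.
    conn.executemany("INSERT INTO doc_version VALUES (?, ?)",
                     [(1, 1), (1, 2), (2, 1), (2, 2), (3, 1)])
    conn.executemany("INSERT INTO published_doc VALUES (?, ?)",
                     [(1, 1), (1, 2), (2, 1), (3, 1)])
    conn.executemany("INSERT INTO pushed_doc VALUES (?, ?)",
                     [(1, 1), (2, 1), (3, 1)])
    return conn


def select_for_publishing(conn, tracking_view):
    """Return IDs of documents whose latest version is newer than the
    latest version recorded in tracking_view."""
    if tracking_view not in ("pushed_doc", "published_doc"):
        raise ValueError("unexpected view name: %r" % tracking_view)
    query = f"""
        SELECT v.doc_id
          FROM (SELECT doc_id, MAX(num) AS latest
                  FROM doc_version
                 GROUP BY doc_id) v
          LEFT JOIN (SELECT doc_id, MAX(doc_version) AS seen
                       FROM {tracking_view}
                      GROUP BY doc_id) t
            ON t.doc_id = v.doc_id
         WHERE v.latest > COALESCE(t.seen, 0)
         ORDER BY v.doc_id
    """
    return [row[0] for row in conn.execute(query)]


conn = build_sample_db()
old_selection = select_for_publishing(conn, "pushed_doc")
new_selection = select_for_publishing(conn, "published_doc")
```

With this sample data, old_selection is [1, 2] and new_selection is [2]: the document with the administrative-only version drops out once the comparison is made against published_doc, while the document with a genuinely unpublished version is still picked up.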

Comment entered 2011-11-10 10:24:49 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2011-11-10 10:24:49
BZCOMMENTOR::Volker Englisch
BZCOMMENT::2

The publishing document, 178.xml (document version 56), has been updated on FRANCK and BACH.

I will monitor the result of this change after tonight's publishing job.

Comment entered 2011-11-11 10:38:13 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2011-11-11 10:38:13
BZCOMMENTOR::Volker Englisch
BZCOMMENT::3

We were publishing on average
406 CTGovProtocols and
344 Closed protocols
every night. After installing the new publishing document on BACH we published last night
117 CTGovProtocols and
12 Closed protocols.

It appears that around 600 documents had been published unnecessarily each night in the past.
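
For reference, the arithmetic behind the "around 600" estimate can be checked with a few lines of Python. The counts are the ones quoted in this comment; the variable names are only illustrative.

```python
# Nightly document counts quoted in this comment.
before = {"CTGovProtocols": 406, "Closed protocols": 344}   # average per night
after = {"CTGovProtocols": 117, "Closed protocols": 12}     # first night after the fix

# Reduction per document type and overall.
saved = {name: before[name] - after[name] for name in before}
total_before = sum(before.values())
total_after = sum(after.values())
total_saved = total_before - total_after

for name, count in saved.items():
    print(f"{name}: {count} fewer documents published")
print(f"Total: {total_before} -> {total_after} ({total_saved} fewer)")
```

This gives 289 fewer CTGovProtocols and 332 fewer Closed protocols, 621 documents in total, which is consistent with the rough figure of 600. Note that it compares a long-run average with a single night, so the exact difference will vary.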

I believe we can close this issue now.

Comment entered 2011-11-17 14:16:01 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2011-11-17 14:16:01
BZCOMMENTOR::Volker Englisch
BZCOMMENT::4

No problems with production identified so far.
Closing issue.
