CDR Tickets

Issue Number 4446
Summary New Published Glossary Terms report broken
Created 2018-04-05 16:22:59
Issue Type Bug
Submitted By Osei-Poku, William (NIH/NCI) [C]
Assigned To Kline, Bob (NIH/NCI) [C]
Status Closed
Resolved 2018-04-06 14:32:12
Resolution Fixed
Path /home/bkline/backups/jira/ocecdr/issue.223654
Description

As discussed in the CDR meeting this afternoon, the New Published Glossary Terms report appears to be broken when you run it for the month of March. No documents are retrieved, although new documents were published in March. Here are 2 GTNs that were published in March that we expect to display on the report.

791735 pseudomyogenic hemangioendothelioma
791790 rare cancer

Comment entered 2018-04-05 16:49:32 by Englisch, Volker (NIH/NCI) [C]

I'm reassigning this to you.
From what I can tell there is nothing wrong with the report. It's working as expected. The report is using the column first_pub from the document table to limit the results to the specified date range. I'm pretty certain that we're not populating this column anymore since we went live with Gauss.
I've only checked the GTN documents at this point, and it appears the last time first_pub was populated was Feb. 14th. We went live with Gauss on Feb. 15th.
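
The failure mode described above can be reproduced in miniature. This is only a sketch under stated assumptions: SQLite stands in for the real database, the table is reduced to three columns, the rows and IDs are invented, and new_published() is a hypothetical helper rather than the actual report code. Only the first_pub column name comes from the discussion.

```python
import sqlite3

# Hypothetical, simplified stand-in for the document table (not the real CDR schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE document (id INTEGER PRIMARY KEY, title TEXT, first_pub TEXT)"
)
conn.executemany(
    "INSERT INTO document (id, title, first_pub) VALUES (?, ?, ?)",
    [
        # Invented rows: one published while first_pub was still being set,
        # two published afterwards, so first_pub was never filled in.
        (1, "term published before the cutover", "2018-02-14 10:00:00"),
        (2, "term published in March", None),
        (3, "another term published in March", None),
    ],
)


def new_published(conn, start, end):
    """Return the ids whose first_pub falls in the half-open range [start, end)."""
    rows = conn.execute(
        "SELECT id FROM document"
        " WHERE first_pub >= ? AND first_pub < ?"
        " ORDER BY id",
        (start, end),
    )
    return [row[0] for row in rows]


february = new_published(conn, "2018-02-01", "2018-03-01")
march = new_published(conn, "2018-03-01", "2018-04-01")
print("February:", february)
print("March:", march)
```

A comparison against NULL is never true, so rows with an empty first_pub silently drop out of every date range. The report query itself runs without error, which is why it looks healthy while returning nothing for March.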

Comment entered 2018-04-05 19:09:13 by Kline, Bob (NIH/NCI) [C]

I have a fix for this. It is a serious enough problem that I propose we include it in the Hawking release. I also have a script for filling in the missing first_pub values. We can talk tomorrow about testing the fix.

Comment entered 2018-04-06 10:42:44 by Englisch, Volker (NIH/NCI) [C]

Yes, I agree. This needs to be fixed as part of Hawking or a special hot-fix to avoid the need for a larger data cleanup effort.

Is it sufficient to populate the missing dates from the publishing job date of the first published version in the doc_version table, or is the cleanup more complicated than that?

Comment entered 2018-04-06 12:10:07 by Kline, Bob (NIH/NCI) [C]

I'm trying to test my fix on DEV, and I keep getting stuck waiting for GateKeeper to confirm push jobs. Could you please kick whichever server needs waking up from its long nap? Thanks!

Comment entered 2018-04-06 12:40:38 by Englisch, Volker (NIH/NCI) [C]

Successfully kicked!
The GK processing queue was turned off, probably due to CBIIT's restart of the systems the other day.

Comment entered 2018-04-06 14:31:48 by Kline, Bob (NIH/NCI) [C]

Thanks.

Here's the code for finding the missing first_pub values (the doc_version table doesn't enter the equation at all).

  SELECT document.id, MIN(pub_proc.started) AS first_pub_date
    FROM document
    JOIN pub_proc_doc
      ON pub_proc_doc.doc_id = document.id
    JOIN pub_proc
      ON pub_proc.id = pub_proc_doc.pub_proc
   WHERE pub_proc.status = 'Success'
     AND pub_proc.completed IS NOT NULL
     AND pub_proc.pub_system = 178
     AND pub_proc_doc.removed = 'N'
     AND pub_proc_doc.failure IS NULL
     AND document.first_pub_knowable = 'Y'
     AND document.first_pub IS NULL
GROUP BY document.id

I have added the missing values on QA and PROD (the only two tiers which had any newly published documents). I will take care of any new gaps which occur before the fix to the publishing code has been promoted. I have installed and tested the fix to the publishing code on DEV. Once William and/or Volker have tested on DEV I will promote the fix to QA. (I imported and hot-fixed a couple of Term documents from the thesaurus for my testing.)
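
For readers who want to see how the query's results could be applied, here is a minimal sketch. The backfill script itself is not shown in this ticket, so backfill_first_pub() is a hypothetical helper and the UPDATE statement is an assumption. SQLite stands in for the real database, and the schema, rows, and dates are invented and contain only the columns the query touches.

```python
import sqlite3

# The query quoted above, unchanged apart from layout.
FIND_MISSING = """
SELECT document.id, MIN(pub_proc.started) AS first_pub_date
  FROM document
  JOIN pub_proc_doc ON pub_proc_doc.doc_id = document.id
  JOIN pub_proc ON pub_proc.id = pub_proc_doc.pub_proc
 WHERE pub_proc.status = 'Success'
   AND pub_proc.completed IS NOT NULL
   AND pub_proc.pub_system = 178
   AND pub_proc_doc.removed = 'N'
   AND pub_proc_doc.failure IS NULL
   AND document.first_pub_knowable = 'Y'
   AND document.first_pub IS NULL
 GROUP BY document.id
"""


def backfill_first_pub(conn):
    """Fill in first_pub for every row the query finds; return the pairs applied."""
    missing = conn.execute(FIND_MISSING).fetchall()
    conn.executemany(
        # Assumed update; the IS NULL guard keeps existing values untouched.
        "UPDATE document SET first_pub = ? WHERE id = ? AND first_pub IS NULL",
        [(started, doc_id) for doc_id, started in missing],
    )
    conn.commit()
    return missing


# Toy schema and rows, invented for this demonstration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE document (id INTEGER PRIMARY KEY, first_pub TEXT,
                       first_pub_knowable TEXT);
CREATE TABLE pub_proc (id INTEGER PRIMARY KEY, pub_system INTEGER, status TEXT,
                       started TEXT, completed TEXT);
CREATE TABLE pub_proc_doc (pub_proc INTEGER, doc_id INTEGER, removed TEXT,
                           failure TEXT);
INSERT INTO document VALUES (101, NULL, 'Y'), (102, NULL, 'Y'), (103, NULL, 'N');
INSERT INTO pub_proc VALUES
    (10, 178, 'Success', '2018-03-05 20:00:00', '2018-03-05 21:00:00'),
    (11, 178, 'Success', '2018-03-12 20:00:00', '2018-03-12 21:00:00'),
    (12, 178, 'Failure', '2018-03-01 20:00:00', NULL);
INSERT INTO pub_proc_doc VALUES
    (10, 101, 'N', NULL), (11, 101, 'N', NULL),
    (12, 102, 'N', NULL), (11, 102, 'N', NULL),
    (10, 103, 'N', NULL);
""")

applied = sorted(backfill_first_pub(conn))
remaining = conn.execute(FIND_MISSING).fetchall()
print(applied)
print(remaining)
```

In the toy data, document 101 takes the earlier of its two successful jobs, document 102 skips the failed job, and document 103 is left alone because first_pub_knowable is 'N'. Running the query again afterwards returns no rows, which makes it a convenient check for new gaps.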

Comment entered 2018-04-06 16:26:35 by Englisch, Volker (NIH/NCI) [C]

Didn't you publish a bunch of terms on DEV to test today? The report is still empty when I try to run it on DEV.

Comment entered 2018-04-06 16:33:10 by Kline, Bob (NIH/NCI) [C]

Yes, but Term documents are not the same as GlossaryTermName documents. I used terms to test the fix for the bug you found in the publishing software because they're easier to create (using the Term Import interface).

Comment entered 2018-04-06 16:36:29 by Englisch, Volker (NIH/NCI) [C]

Oh, I see. I just saw those hot-fixes flying and thought you were publishing glossary docs.

... and yes, I do use the words terms and glossaries sometimes to confuse myself and others. :-)

Comment entered 2018-04-06 18:39:21 by Englisch, Volker (NIH/NCI) [C]

Could you prepare a few documents on DEV so we can test this report as part of Hawking?

Comment entered 2018-04-09 10:52:28 by Osei-Poku, William (NIH/NCI) [C]

These two terms are ready for testing:

783210 (distal extrahepatic bile duct)
783212 (senses)

Comment entered 2018-04-09 11:09:41 by Englisch, Volker (NIH/NCI) [C]

Looks good to me. The first_pub dates are populated and the report lists both of the terms after they have been pushed to GateKeeper. Thumbs up from me. 👍

Comment entered 2018-04-09 12:42:26 by Osei-Poku, William (NIH/NCI) [C]

Could you please hot-fix this term as well: 783214?

Comment entered 2018-04-09 13:00:21 by Englisch, Volker (NIH/NCI) [C]

Done.

Comment entered 2018-04-10 12:33:39 by Osei-Poku, William (NIH/NCI) [C]

Thanks!

Comment entered 2018-04-10 12:33:52 by Osei-Poku, William (NIH/NCI) [C]

Verified on QA.

Comment entered 2018-04-10 14:22:26 by Kline, Bob (NIH/NCI) [C]

Checked in and installed on QA.

https://github.com/NCIOCPL/cdr-lib/commit/df69a92

Comment entered 2018-05-09 13:38:03 by Osei-Poku, William (NIH/NCI) [C]

Verified on PROD. Thanks!
