CDR Tickets

Issue Number 4499
Summary [Citations] Non-publishable citations on Update Pre-Medline Citations report
Created 2018-07-09 16:13:46
Issue Type Bug
Submitted By Osei-Poku, William (NIH/NCI) [C]
Assigned To Kline, Bob (NIH/NCI) [C]
Status Closed
Resolved 2019-08-05 17:00:53
Resolution Fixed
Path /home/bkline/backups/jira/ocecdr/issue.228941
Description

The Update Pre-Medline Citations report has been importing updated citations without creating publishable versions. This appears to be similar to OCECDR-4384 from the Gauss release.

Comment entered 2018-09-05 12:19:00 by Kline, Bob (NIH/NCI) [C]

I have examined the code and verified that the call to cdr.repDoc() needs to have the publishable=True argument added. This ticket can be added to Joule.
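
A minimal sketch of why the missing argument matters. The function below is a hypothetical stand-in, not the real cdr.repDoc(), whose full signature is not shown in this ticket; the only detail taken from the ticket is the publishable=True keyword. The other parameter names are assumptions for illustration.

```python
# Hypothetical stand-in for the save call; it only reports what would be created.
# Apart from `publishable`, the parameter names are assumptions.
def rep_doc_stub(session, *, doc, ver=False, publishable=False):
    """Describe the version a save with these options would create."""
    if publishable and not ver:
        raise ValueError("a publishable version requires ver=True")
    return {"doc": doc, "versioned": ver, "publishable": publishable}


# Without the keyword, the default applies and the new version is
# silently non-publishable.
before_fix = rep_doc_stub("session", doc="<Citation/>", ver=True)

# With the keyword, the new version is marked publishable.
after_fix = rep_doc_stub("session", doc="<Citation/>", ver=True, publishable=True)
```

Because the flag defaults to off in this sketch, a call that omits it succeeds without any error, which matches the symptom reported above: citations are updated but no publishable version appears.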

Comment entered 2019-06-18 09:37:47 by Kline, Bob (NIH/NCI) [C]

Fixed on DEV. Please test.

Comment entered 2019-07-03 11:28:06 by Osei-Poku, William (NIH/NCI) [C]

We can't tell if the report is working well. Some of the updated documents display DTD errors:

CDR0000793264

CDR0000793438

Non-publishable versions were also created for some of them.

CDR0000792451

CDR0000792650

Comment entered 2019-07-03 11:59:23 by Kline, Bob (NIH/NCI) [C]

It might make the most sense to test that aspect of this issue once  and you have made a decision about what we're going to preserve for Citations and the ticket for OCECDR-4561 has been implemented, don't you think?

Comment entered 2019-08-05 13:31:36 by Osei-Poku, William (NIH/NCI) [C]

Several citations imported without publishable versions.
796586
794330
At least two other citations display a DTD error message.
791338
796641

Comment entered 2019-08-05 14:24:41 by Kline, Bob (NIH/NCI) [C]

Ah, we never connected OCECDR-4561 with this script. That issue dealt with

Modify the import/update script (CiteSearch.py) to use the new filter

... as #5 under

Subtasks would include: ...

Let me apply the changes for that ticket to this script and you can test again.

Comment entered 2019-08-05 17:00:53 by Kline, Bob (NIH/NCI) [C]

Now this script has the changes from OCECDR-4561. I also found and fixed another bug which had been reported for CiteSearch.py but not for this script (the publishable flag was not being set to "Y"). Then I ran a modified version of this script to fool it into re-importing the citations from NLM even though the statuses hadn't changed since the last time you ran it, and even though some of the statuses which resulted from your run made the citations no longer eligible for import by this script. You can bring up https://cdr-qa.cancer.gov/cgi-bin/cdr/CdrQueries.py and run the "Citations refreshed for OCECDR-4499" query to see that the validation status and the publishable flag are as they should be. The script's query has been restored from the one I used to fix the data. You can't do any more testing of this script until NLM changes some of the statuses, but I have applied the fixes to DEV, so you might be able to test there.

I factored out the processing of what we get from NLM into common code used by both scripts (this one and CiteSearch.py), so that we won't have similar bugs in the future (the transformation of the XML gets modified in one script but we forget about the other one), so you will probably want to re-test CiteSearch.py (I've done some testing but testing by the users is always the best).
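
A minimal sketch of that kind of refactoring, under stated assumptions: every function name, the dropped tag, and the returned dictionaries are hypothetical and are not taken from the CDR codebase. The point is only that both entry points call one shared transformation, so a change to the XML handling cannot reach one script and miss the other.

```python
import xml.etree.ElementTree as ET

# Hypothetical rule: element names the shared filter strips from NLM's XML.
DROPPED_TAGS = {"Extra"}


def normalize_citation(nlm_xml):
    """Shared transformation applied to XML received from NLM (illustrative)."""
    root = ET.fromstring(nlm_xml)
    # Snapshot the element list first so removals do not disturb iteration.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag in DROPPED_TAGS:
                parent.remove(child)
    return ET.tostring(root, encoding="unicode")


def import_new_citation(nlm_xml):
    """Stand-in for the import path (the role CiteSearch.py plays)."""
    return {"action": "add", "xml": normalize_citation(nlm_xml)}


def refresh_existing_citation(cdr_id, nlm_xml):
    """Stand-in for the update path (the role this report's script plays)."""
    return {"action": "replace", "id": cdr_id, "xml": normalize_citation(nlm_xml)}


sample = "<PubmedArticle><Title>T</Title><Extra>x</Extra></PubmedArticle>"
added = import_new_citation(sample)
refreshed = refresh_existing_citation("CDR0000000001", sample)
```

Both results carry identical XML because neither caller has its own copy of the transformation.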

Comment entered 2019-08-06 09:14:07 by Osei-Poku, William (NIH/NCI) [C]

We were able to confirm the changes on DEV. Thanks

Comment entered 2019-09-05 13:59:07 by Osei-Poku, William (NIH/NCI) [C]

Verified on PROD. Thanks!
