CDR Tickets

Issue Number 3588
Summary Links from Drug Dictionary to NCI Thesaurus
Created 2013-03-06 15:13:46
Issue Type Improvement
Submitted By Osei-Poku, William (NIH/NCI) [C]
Assigned To Englisch, Volker (NIH/NCI) [C]
Status Closed
Resolved 2013-09-26 09:02:57
Resolution Fixed
Path /home/bkline/backups/jira/ocecdr/issue.107916
Description

BZISSUE::5287
BZDATETIME::2013-03-06 15:13:46
BZCREATOR::William Osei-Poku
BZASSIGNEE::Volker Englisch
BZQACONTACT::William Osei-Poku

NCI Thesaurus concept codes/IDs are added to Term documents (using the NCIThesaurusConcept element), which are in turn used to create links from the Drug Dictionary to the NCI Thesaurus on Cancer.gov. However, there is always a lag between when we manually add the concept code to the CDR Term document and when the ID (term) is made available in the thesaurus (by NCI Thesaurus staff). During this period, the link from the Drug Dictionary on Cancer.gov to the NCI Thesaurus doesn't work. As suggested by Margaret in the CIAT meeting yesterday, we want to implement a solution similar to the one for Glossary terms that have not been published yet but have been linked to by other CDR documents. We would like to discuss this and explore other solutions to the problem.

Comment entered 2013-08-20 14:56:27 by chengep

Let's talk about the options at this Thursday's meeting.

Comment entered 2013-08-29 14:31:09 by Kline, Bob (NIH/NCI) [C]

This will involve:

1. A new attribute on the Term document (thesaurus record is public)
2. Filter change to avoid creating link in the published doc if thesaurus record isn't public
3. Report of Term documents with thesaurus links not marked public
4. Possible enhanced version of 3 to include realtime check of thesaurus
5. Global change job to populate existing Term docs with the new attribute

Comment entered 2013-09-05 10:01:32 by Osei-Poku, William (NIH/NCI) [C]

Mary and I discussed this report and she indicated that it will be helpful if there is a real-time check of the thesaurus, as suggested in #4 above. The inclusion of the possible new attribute and changes to the vendor filter will fix the dead links on Cancer.gov. Reporting thesaurus links not marked public (#3) will mean checking about 30 to 40 terms each week to see whether they are available on the public web site of the NCI Thesaurus. Currently she is only checking the 'high profile' terms, which is just a few terms.

Comment entered 2013-09-23 11:10:08 by Kline, Bob (NIH/NCI) [C]

Volker:

Can I help with parts of this? How about if I do the schema change, the report, and the global change, leaving you to wrestle with the filter changes?

Comment entered 2013-09-23 11:20:53 by Englisch, Volker (NIH/NCI) [C]

That would certainly help a great deal. Aren't you already working on the next EBMS release?

Comment entered 2013-09-23 11:35:07 by Kline, Bob (NIH/NCI) [C]

Not yet. I'll jump in on those parts of this task.

Comment entered 2013-09-24 23:04:14 by Kline, Bob (NIH/NCI) [C]

Schema has been modified on DEV, and a test-mode global change has been run. Please review the results:

https://cdr.dev.cancer.gov/cgi-bin/cdr/ShowGlobalChangeTestResults.py?dir=2013-09-24_17-26-01

Comment entered 2013-09-25 15:37:30 by Kline, Bob (NIH/NCI) [C]

In order to keep on track for the release schedule, we'll need to keep the momentum going on this issue. I'm going to proceed with a live run of the global change on DEV soon, without waiting too much longer for the review of the test run.

Comment entered 2013-09-25 16:30:42 by Englisch, Volker (NIH/NCI) [C]

William, could you give me a sample of one of these terms with a broken link on Cancer.gov?

Comment entered 2013-09-25 16:38:31 by Englisch, Volker (NIH/NCI) [C]

Ignore my last comment. I found what I was looking for.

When there is a NCIThesaurusConcept ID the text displayed is:

Check for active clinical trials or closed clinical trials using this agent. (NCI Thesaurus)

I thought we wanted to remove the link like we do with the glossary terms but that wouldn't make sense here since the text (NCI Thesaurus) would still be displayed.
Am I correct that we want to remove the text (NCI Thesaurus) when the concept ID isn't public?

Comment entered 2013-09-25 16:44:23 by Beckwith, Margaret (NIH/NCI) [E]

Yes, I think that text should be removed if we can't make the link.
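
The rule agreed on here can be sketched in Python. This is only an illustration, not the actual change to the vendor filter. It assumes the public flag is an attribute named Public with the value "Yes" on the NCIThesaurusConcept element, as described later in this ticket; the function names and the sample documents in the usage lines are hypothetical.

```python
import xml.etree.ElementTree as ET


def public_concept_codes(term_xml):
    """Return the concept codes whose element is marked Public="Yes"."""
    root = ET.fromstring(term_xml)
    codes = []
    for node in root.iter("NCIThesaurusConcept"):
        code = (node.text or "").strip()
        if code and node.get("Public") == "Yes":
            codes.append(code)
    return codes


def thesaurus_suffix(term_xml):
    """Return the " (NCI Thesaurus)" text, or "" when no concept is public."""
    return " (NCI Thesaurus)" if public_concept_codes(term_xml) else ""


# Hypothetical sample documents: one marked public, one not yet marked.
public_doc = '<Term><NCIThesaurusConcept Public="Yes">C97336</NCIThesaurusConcept></Term>'
pending_doc = '<Term><NCIThesaurusConcept>C97336</NCIThesaurusConcept></Term>'
print(repr(thesaurus_suffix(public_doc)))
print(repr(thesaurus_suffix(pending_doc)))
```

Dropping the whole suffix, rather than only the hyperlink, avoids leaving the bare text (NCI Thesaurus) on the page with nothing to click.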

Comment entered 2013-09-25 17:46:32 by Englisch, Volker (NIH/NCI) [C]

The following filter has been modified in order to remove the NCI Thesaurus link:

  • CDR000134.xsl: Vendor Filter: Term

I will run a couple of tests once the data has been updated with the new attribute.

Comment entered 2013-09-26 06:41:18 by Osei-Poku, William (NIH/NCI) [C]

I have reviewed several of the terms in the test run and the live documents, and the new attribute appears to have been added correctly. But I do have a question about the timing of the global. Was the global applied only to terms that are in the thesaurus, or was it applied to all terms?

Comment entered 2013-09-26 06:57:12 by Kline, Bob (NIH/NCI) [C]

I'm not sure what the relationship between scope and timing would be, but since we decided to use the same Public attribute we have on email addresses, the only valid value is "Yes". So yes, the attribute was only added to terms which are in the publicly available database, not the ones you get in advance of publication on the spreadsheets sent to you. If you would prefer to change the definition of this attribute to allow values of both "Yes" and "No", we can make the attribute required and apply it to all NCIThesaurusConcept elements. That would actually make the report software for this issue more efficient.
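
A minimal sketch of how a report could pick out the concepts that still need attention under either definition of the attribute. Comparing against "Yes" treats a missing attribute and an explicit "No" the same way. The function name, the document IDs, and the concept codes in the usage lines are made up for illustration; this is not the report script itself.

```python
import xml.etree.ElementTree as ET


def concepts_not_public(docs):
    """Return (doc_id, concept_code) pairs for concepts not marked Public="Yes".

    `docs` maps a document ID to the serialized XML of a Term document.
    """
    rows = []
    for doc_id in sorted(docs):
        root = ET.fromstring(docs[doc_id])
        for node in root.iter("NCIThesaurusConcept"):
            # A missing attribute and an explicit "No" both count as not public.
            if node.get("Public") != "Yes":
                rows.append((doc_id, (node.text or "").strip()))
    return rows


# Hypothetical sample documents (IDs and codes are placeholders).
sample = {
    "CDR0000000001": '<Term><NCIThesaurusConcept Public="Yes">C00001</NCIThesaurusConcept></Term>',
    "CDR0000000002": '<Term><NCIThesaurusConcept>C00002</NCIThesaurusConcept></Term>',
    "CDR0000000003": '<Term><NCIThesaurusConcept Public="No">C00003</NCIThesaurusConcept></Term>',
}
for doc_id, code in concepts_not_public(sample):
    print(doc_id, code)
```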

Comment entered 2013-09-26 07:04:24 by Kline, Bob (NIH/NCI) [C]

One tip for the folks entering the concept IDs: they are treated as case-sensitive by the thesaurus software, so you may find some marked as not yet public which you didn't expect. For example, c97336 is not found, but C97336 is.
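
One way to guard against the case problem is to normalize the code before it is stored or looked up. The sketch below is an illustration only; the function names are hypothetical, and the pattern (a capital C followed by digits) is an assumption based on the example above rather than a documented rule.

```python
import re

# Assumption: concept codes look like "C" followed by digits (as in C97336).
CONCEPT_CODE = re.compile(r"C\d+")


def normalize_concept_code(raw):
    """Trim whitespace and upper-case a concept code."""
    return raw.strip().upper()


def is_well_formed(code):
    """Check the (assumed) shape of a concept code, case-sensitively."""
    return CONCEPT_CODE.fullmatch(code) is not None


print(normalize_concept_code(" c97336 "))
print(is_well_formed("c97336"), is_well_formed("C97336"))
```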

Comment entered 2013-09-26 07:08:18 by Osei-Poku, William (NIH/NCI) [C]

Volker,
For test purposes, I have modified the following terms so that you can run your tests:

CDR0000539704
CDR0000539695
CDR0000539100
CDR0000038786
CDR0000540668
CDR0000539705
CDR0000543726
CDR0000544572
CDR0000544718
CDR0000544742
CDR0000544743
CDR0000041197

Comment entered 2013-09-26 07:19:52 by Osei-Poku, William (NIH/NCI) [C]

Bob,
I asked the question because, from my testing, I didn't come across any terms that had not been assigned the new attribute, so I assumed that all the terms with concept IDs had been assigned the new value (whether they were in the thesaurus or not). One more question for you: what is the easiest way to find terms with concept IDs that are not yet in the thesaurus?

Comment entered 2013-09-26 08:11:35 by Kline, Bob (NIH/NCI) [C]

I'll answer that question as soon as I have finished implementing step 3 above. :-)

Comment entered 2013-09-26 09:01:02 by Kline, Bob (NIH/NCI) [C]

Report has been implemented on DEV (including real-time check of thesaurus for item 4 above, reflected in third column of report). Ready for user testing:

https://cdr.dev.cancer.gov/cgi-bin/cdr/ocecdr-3588.py
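
A rough sketch of how a real-time column like the one described above could be built. The thesaurus lookup is passed in as a callable, so nothing here depends on any particular thesaurus interface; the stand-in lookup, the function name, the first two columns, and the sample IDs and codes are all assumptions made for illustration.

```python
def build_report_rows(concepts, is_public_now):
    """Return (doc_id, concept_code, found) rows for the report.

    `concepts` is an iterable of (doc_id, concept_code) pairs not yet
    marked public. `is_public_now` is any callable that takes a code and
    returns True or False; in real use it would query the thesaurus.
    """
    rows = []
    for doc_id, code in concepts:
        found = "Yes" if is_public_now(code) else "No"
        rows.append((doc_id, code, found))
    return rows


# Stand-in lookup backed by a fixed set, instead of a live thesaurus query.
known_public = {"C00001"}
rows = build_report_rows(
    [("CDR0000000001", "C00001"), ("CDR0000000002", "C00002")],
    lambda code: code in known_public,
)
for row in rows:
    print("\t".join(row))
```

Rows showing "Yes" in the last column are the ones whose Term documents could now be marked public.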

Comment entered 2013-09-26 09:02:57 by Kline, Bob (NIH/NCI) [C]

All five of the tasks for this issue have been implemented (on DEV).

Comment entered 2013-09-26 12:40:22 by Osei-Poku, William (NIH/NCI) [C]

Please add two columns to the report and display data from the DateLastModified and SemanticType elements.

Comment entered 2013-09-26 12:44:41 by Englisch, Volker (NIH/NCI) [C]

For test purposes, I have modified the following terms so that you can run your tests:

On which machine is this - QA or DEV?

Comment entered 2013-09-26 12:46:40 by Osei-Poku, William (NIH/NCI) [C]

On DEV.

Comment entered 2013-09-26 13:01:36 by Englisch, Volker (NIH/NCI) [C]

I finished testing. The link to the NCI Thesaurus is dropped on DEV and listed on QA.

Comment entered 2013-09-26 14:57:06 by Kline, Bob (NIH/NCI) [C]

The two new columns have been added to the report.

Comment entered 2013-09-27 12:52:51 by Osei-Poku, William (NIH/NCI) [C]

Verified on DEV.

Comment entered 2013-09-27 13:37:59 by Kline, Bob (NIH/NCI) [C]

Did you want the value to be applied to all the NCIThesaurusConcept elements, using "Yes" and "No" as the valid values?

Comment entered 2013-09-27 13:52:35 by Englisch, Volker (NIH/NCI) [C]

The following filter has been versioned in SVN:

  • R12050: CDR000134.xml (Vendor Filter: Term)

The filter needs to be installed using the following command (typed on one line):

 $ updateFilter.py <username> <password> CDR0000000134.xml 
                   --docid=134 --version=Y  
                   --publishable=Y 
                   --comment="R12050 (OCECDR-3588): Suppress NCI Thesaurus link"

Comment entered 2013-09-27 13:55:24 by Osei-Poku, William (NIH/NCI) [C]

That will be good. It will leave little or no room for misinterpretation. Also, please place the report on the reports menu under Terminology/Other Reports.

Comment entered 2013-10-07 13:10:52 by Kline, Bob (NIH/NCI) [C]

Changes made; ready for review on DEV. You have CDR38786 locked, William, so the global wasn't able to update that document.

Comment entered 2013-10-07 15:03:35 by Osei-Poku, William (NIH/NCI) [C]

I tried to manually update one record to set the value to 'No' but I am getting a schema validation error (DEV).

Comment entered 2013-10-07 15:10:56 by Kline, Bob (NIH/NCI) [C]

Did you log out of XMetaL and back in?

Comment entered 2013-10-07 15:35:03 by Osei-Poku, William (NIH/NCI) [C]

I just did and I can now see the new value. All changes are now verified.

Comment entered 2013-10-08 10:04:33 by Kline, Bob (NIH/NCI) [C]

The report has also been installed on the admin menu as requested (DEV).

Comment entered 2013-10-08 10:48:31 by Osei-Poku, William (NIH/NCI) [C]

Yes. I did verify the report on DEV yesterday.

Comment entered 2013-11-07 13:19:16 by Osei-Poku, William (NIH/NCI) [C]

I get the following error when I run the report from the admin menu:

502 - Web server received an invalid response while acting as a gateway or proxy server.

There is a problem with the page you are looking for, and it cannot be displayed. When the Web server (while acting as a gateway or proxy) contacted the upstream content server, it received an invalid response from the content server.

Comment entered 2013-11-07 15:18:46 by Kline, Bob (NIH/NCI) [C]

I was able to run the report without any problems, so perhaps this was a temporary glitch (possibly caused by sluggish performance on the QA server, which is abysmally slow). Could you give it another try, please?

Comment entered 2013-11-07 15:30:26 by Osei-Poku, William (NIH/NCI) [C]

Yes. I am able to run the report successfully now. Thanks!

Verified on QA.

Comment entered 2013-11-21 16:48:12 by Kline, Bob (NIH/NCI) [C]

The global change job is running on PROD, but it will take a while (almost 10K docs to update). Hold off on trying the report until I let you know the global change job has finished.

Comment entered 2013-11-22 08:51:46 by Kline, Bob (NIH/NCI) [C]

The global change job completed successfully. You can run the report on production now.

Comment entered 2013-11-22 09:05:52 by Osei-Poku, William (NIH/NCI) [C]

It appears the schema changes are not installed yet. We are getting DTD validation errors when accessing some of the terminology files with the new attribute, such as term CDR0000042613. I have logged out and logged back into XMetaL with the same results.

Comment entered 2013-11-22 09:43:00 by Kline, Bob (NIH/NCI) [C]

The repository had the schema changes (so the global change job did not invalidate documents which would otherwise be valid), but somehow the job to update the DTD for the client didn't take. The problem has been corrected. Please log out, log back in, and try again.

Comment entered 2013-11-26 19:10:16 by Osei-Poku, William (NIH/NCI) [C]

Verified on Prod. and Cancer.gov
