Issue Number | 4322 |
---|---|
Summary | Fix drug term links to Active Clinical Trials |
Created | 2017-10-05 15:49:36 |
Issue Type | Improvement |
Submitted By | Osei-Poku, William (NIH/NCI) [C] |
Assigned To | Englisch, Volker (NIH/NCI) [C] |
Status | Closed |
Resolved | 2017-10-12 15:44:07 |
Resolution | Fixed |
Path | /home/bkline/backups/jira/ocecdr/issue.215335 |
I have posted our offline email exchanges below. This is a request to modify the vendor filters to use the concept code from the NCIThesaurusConcept element to create the "Check for active clinical trials" link instead of using the concept code from OtherName/SourceInformation/VocabularySource/SourceTermId. The NCIThesaurusConcept code is checked to confirm that it is public, and the Public = Yes attribute is set or not depending on whether the concept code is available on the live NCI Thesaurus site.
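For illustration only, the requested rule could be sketched as follows. The actual vendor filter (CDR134) is not shown in this ticket, so this is not its code. The sketch assumes that NCIThesaurusConcept is a direct child of the Term root and carries a Public attribute with the value "Yes"; the function name active_trials_link is hypothetical. The base URL is taken from the links quoted in the emails below.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# Base URL as it appears in the links quoted in this ticket.
CTS_BASE = "https://www.cancer.gov/about-cancer/treatment/clinical-trials/intervention/"


def active_trials_link(term_xml: str) -> Optional[str]:
    """Return the clinical trials URL, or None when no link should be created."""
    root = ET.fromstring(term_xml)
    # Assumption: the concept code lives in /Term/NCIThesaurusConcept
    # and the element has a Public attribute ("Yes" when published).
    concept = root.find("NCIThesaurusConcept")
    if concept is None:
        return None  # no concept element: treat the code as nonexistent
    code = (concept.text or "").strip()
    if not code or concept.get("Public") != "Yes":
        return None  # unpublished or empty code: suppress the link
    return CTS_BASE + code


if __name__ == "__main__":
    sample = '<Term><NCIThesaurusConcept Public="No">C139550</NCIThesaurusConcept></Term>'
    print(active_trials_link(sample))
```

The point of the sketch is that the SourceTermId value is never consulted, so a code marked as not public cannot leak into the link.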
From: Englisch, Volker (NIH/NCI) [C] volker@mail.nih.gov
Sent: Tuesday, October 03, 2017 5:03 PM
To: Osei-Poku, William <William.Osei-Poku@icf.com>
Subject: RE: Drug Dictionary
I see that you marked the NCIThesaurusConcept ID as not-public. The
link for the ‘active clinical trials’, however, doesn’t look at that
element. Instead, it’s using the element
OtherName/SourceInformation/VocabularySource/SourceTermId
to create the link.
If this needs to be changed please enter a ticket to have the filter
updated.
Thanks,
Volker
Volker Englisch
NCI OCPL – Office of Communications & Public Liaison
Contractor: Sapient
NCI: 240-276-6583
From: Osei-Poku, William William.Osei-Poku@icf.com
Sent: Tuesday, October 03, 2017 4:47 PM
To: Englisch, Volker (NIH/NCI) [C] <volker@mail.nih.gov>
Subject: RE: Drug Dictionary
Oh. I missed your other email about the check for active clinical trials. We do not maintain that information as far as I can tell. I think that is generated by the vendor filters. What we removed are the links to the NCI Thesaurus, which are generated using the Concept Codes we put in, which we have now removed.
From: Englisch, Volker (NIH/NCI) [C] volker@mail.nih.gov
Sent: Tuesday, October 03, 2017 4:05 PM
To: Osei-Poku, William <William.Osei-Poku@icf.com>
Subject: RE: Drug Dictionary
Yes, these links should not exist anymore.
I still see the broken links displayed on Cancer.gov.
Thanks,
Volker
Volker Englisch
NCI OCPL – Office of Communications & Public Liaison
Contractor: Sapient
NCI: 240-276-6583
From: Osei-Poku, William William.Osei-Poku@icf.com
Sent: Tuesday, October 03, 2017 3:55 PM
To: Englisch, Volker (NIH/NCI) [C] <volker@mail.nih.gov>
Subject: RE: Drug Dictionary
Yes, these links should not exist anymore. We removed the Concept codes so that just like the glossary terms, the link will no longer be created. When we put the codes in the term documents, that prompts the links to be created. Now that we have removed the codes, in the term document on Cancer.gov, there wouldn’t be any links to the Thesaurus at all. Hope this helps, if not, please let me know.
Thanks,
William
From: Englisch, Volker (NIH/NCI) [C] volker@mail.nih.gov
Sent: Tuesday, October 03, 2017 3:08 PM
To: Osei-Poku, William <William.Osei-Poku@icf.com>
Subject: RE: Drug Dictionary
You have hot-fixed those to do what? At least for the one I mentioned
originally (CCR2/CCR5 antagonist BMS-813160) the link
Check for active clinical trials using this agent.
still gives you a “Page not found” error.
Thanks,
Volker
Volker Englisch
NCI OCPL – Office of Communications & Public Liaison
Contractor: Sapient
NCI: 240-276-6583
From: Osei-Poku, William William.Osei-Poku@icf.com
Sent: Tuesday, October 03, 2017 2:34 PM
To: Englisch, Volker (NIH/NCI) [C] <volker@mail.nih.gov>
Subject: RE: Drug Dictionary
Yes, it did. Thanks! I have hot-fixed all of them. Thank you!
From: Englisch, Volker (NIH/NCI) [C] volker@mail.nih.gov
Sent: Tuesday, October 03, 2017 12:19 PM
To: Osei-Poku, William <William.Osei-Poku@icf.com>
Subject: RE: Drug Dictionary
Does this work:
791110 - C139800
791109 - C139730
791108 - C139561
791106 - C139559
791105 - C139553
791104 - C139552
791103 - C139551
791102 - C139550
Thanks,
Volker
Volker Englisch
NCI OCPL – Office of Communications & Public Liaison
Contractor: Sapient
NCI: 240-276-6583
From: Osei-Poku, William William.Osei-Poku@icf.com
Sent: Tuesday, October 03, 2017 12:10 PM
To: Englisch, Volker (NIH/NCI) [C] <volker@mail.nih.gov>
Subject: RE: Drug Dictionary
Hi Volker,
It is difficult to identify the CDR documents using these links. Are you
able to provide me with a list of CDR IDs or drug names?
Thanks,
William
From: Englisch, Volker (NIH/NCI) [C] volker@mail.nih.gov
Sent: Tuesday, October 03, 2017 11:53 AM
To: Osei-Poku, William <William.Osei-Poku@icf.com>
Subject: RE: Drug Dictionary
I do have a list but I think this is only a partial list that I was
given. I will check.
Here are the broken links I have received:
Broken URL | Referring URL |
---|---|
https://www.cancer.gov/about-cancer/treatment/clinical-trials/intervention/C139550 | https://www.cancer.gov/publications/dictionaries/cancer-drug?expand=C |
https://www.cancer.gov/about-cancer/treatment/clinical-trials/intervention/C139551 | https://www.cancer.gov/publications/dictionaries/cancer-drug?expand=O |
https://www.cancer.gov/about-cancer/treatment/clinical-trials/intervention/C139552 | https://www.cancer.gov/publications/dictionaries/cancer-drug?expand=C |
https://www.cancer.gov/about-cancer/treatment/clinical-trials/intervention/C139553 | https://www.cancer.gov/publications/dictionaries/cancer-drug/?expand=A&searchTxt=A&first=401&page=3 |
https://www.cancer.gov/about-cancer/treatment/clinical-trials/intervention/C139559 | https://www.cancer.gov/publications/dictionaries/cancer-drug?expand=S |
https://www.cancer.gov/about-cancer/treatment/clinical-trials/intervention/C139561 | https://www.cancer.gov/publications/dictionaries/cancer-drug/?expand=A&searchTxt=A&first=401&page=3 |
https://www.cancer.gov/about-cancer/treatment/clinical-trials/intervention/C139730 | https://www.cancer.gov/publications/dictionaries/cancer-drug/?expand=A&searchTxt=A&first=801&page=5 |
https://www.cancer.gov/about-cancer/treatment/clinical-trials/intervention/C139800 | https://www.cancer.gov/publications/dictionaries/cancer-drug/?expand=T&searchTxt=T&first=201&page=2 |
Thanks,
Volker
Volker Englisch
NCI OCPL – Office of Communications & Public Liaison
Contractor: Sapient
NCI: 240-276-6583
From: Osei-Poku, William William.Osei-Poku@icf.com
Sent: Tuesday, October 03, 2017 11:38 AM
To: Englisch, Volker (NIH/NCI) [C] <volker@mail.nih.gov>
Subject: RE: Drug Dictionary
Hi Volker,
I apologize for this. It looks like we forgot to set the attribute for
this one as Not Public until they were on the public NCI thesaurus
site. This has been fixed and I will hot-fix it later today. Do you have
a list of the other terms that are affected?
Thanks,
William
From: Englisch, Volker (NIH/NCI) [C] volker@mail.nih.gov
Sent: Tuesday, October 03, 2017 9:35 AM
To: Osei-Poku, William <William.Osei-Poku@icf.com>
Subject: Drug Dictionary
Hi William,
We have been notified that there are a bunch of drug terms with a
broken link, like “CCR2/CCR5 antagonist BMS-813160” (CDR791102). This
term lists the concept ID as C139550 but when you search for this ID in
the NCI thesaurus it doesn’t exist.
Why would we publish a term with an NCI-ID that’s not publicly
available?
I think all of the other drug terms are having similar issues.
Thanks,
Volker
Volker Englisch
NCI OCPL – Office of Communications & Public Liaison
Contractor: Sapient
NCI: 240-276-6583
I noticed there exist terms without the NCIThesaurusConcept element
although a C-code exists and is published.
Should these terms be handled as if the C-code does exist or not?
They should be treated as if the concept code does not exist. At this point I assume the only true source of the code is the NCIThesaurusConcept element. Meanwhile, is it possible to get a list of such terms?
Another question:
A comment in the filter code specifies to suppress the active
clinical trials link if a DrugInfo Summary links to the current
document.
Is this still the right thing to do after our switch to CTS?
I see some terms with only the active clinical trials link and some with only the NCI Thesaurus link. Other than the exception listed in the previous comment shouldn't both links get created for a C-code that's published?
I am not sure why this was specified. Margaret may be able to help answer this question.
That is right. I don't see a reason why they shouldn't have both links.
Meanwhile, is it possible to get a list of such terms?
Yes, that's possible. However, I cannot identify a clear pattern of how to pick up the existing C-code from the term document. I see the C-code is sometimes entered as part of the OtherName with a SourceTermType = PT (primary type, I suppose), but there are also term documents where I find the C-code listed as part of the definition block and a primary type does not exist.
This doesn't appear to be a simple SQL query and could turn into a project itself. Please create a separate ticket for this cleanup report.
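As a rough starting point for such a cleanup report, the simplest of the cases described above could be checked per document as sketched below. This is an assumption-laden sketch, not an existing CDR report: it assumes the code sits in a SourceTermId element somewhere under the Term root, that C-codes match the pattern "C" followed by digits, and the function name codes_needing_cleanup is hypothetical. It does not cover codes that appear only in the definition block.

```python
import re
import xml.etree.ElementTree as ET
from typing import List

# Assumption: NCI Thesaurus concept codes look like "C" followed by digits.
C_CODE = re.compile(r"^C\d+$")


def codes_needing_cleanup(term_xml: str) -> List[str]:
    """Return C-codes found in SourceTermId when NCIThesaurusConcept is absent."""
    root = ET.fromstring(term_xml)
    if root.find("NCIThesaurusConcept") is not None:
        return []  # the preferred element exists, nothing to report
    found: List[str] = []
    # Look at every SourceTermId, wherever it is nested (e.g. under OtherName).
    for node in root.iter("SourceTermId"):
        value = (node.text or "").strip()
        if C_CODE.match(value) and value not in found:
            found.append(value)
    return found
```

A non-empty result marks a document that lists a C-code but lacks the NCIThesaurusConcept element, which is the group this ticket says should be treated as having no code.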
I made the filter changes and ran a diff report on QA. Many (all?) of the clinical trials link errors will be resolved by dropping the sentence _Check for active clinical trials ..._ or replacing the C-code with a valid code.
However, I see many links to the NCI Thesaurus that are
changing due to the existence of multiple C-codes for a single
term.
~oseipokuw, how should the
filter identify which C-code to use when creating the link?
See, for example, suramin (CDR40052) with the two C-codes: C853, C1848
Adding ~mbeckwit as a watcher to answer this question. I left the filter code as is for now, meaning the link to clinical trials will not be created for DIS terms.
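To make the ambiguity concrete, a check for terms carrying more than one concept code might look like the sketch below. It assumes the codes are stored in repeated NCIThesaurusConcept elements under the Term root, which this ticket does not confirm; the function name concept_codes is hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import List


def concept_codes(term_xml: str) -> List[str]:
    """Return the distinct NCIThesaurusConcept codes in document order."""
    root = ET.fromstring(term_xml)
    codes: List[str] = []
    # Assumption: each code is the text of an NCIThesaurusConcept element.
    for node in root.iter("NCIThesaurusConcept"):
        code = (node.text or "").strip()
        if code and code not in codes:
            codes.append(code)
    return codes


if __name__ == "__main__":
    # Codes from the suramin example in this ticket; the XML layout is assumed.
    suramin = (
        "<Term>"
        "<NCIThesaurusConcept>C853</NCIThesaurusConcept>"
        "<NCIThesaurusConcept>C1848</NCIThesaurusConcept>"
        "</Term>"
    )
    codes = concept_codes(suramin)
    if len(codes) > 1:
        print("ambiguous term, codes:", ", ".join(codes))
```

A term for which this returns more than one code has no well-defined link target until the content is cleaned up or a selection rule is agreed on.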
The following filter has been updated to suppress creating the clinical trials link if the term has not been published yet.
CDR134: Vendor Filter: Term [cts-hotfix 5c3fd00d]
https://github.com/NCIOCPL/cdr-cts-hotfix/commit/5c3fd00d
I ran a diff report on QA and confirmed that I didn't break anything. There were two types of differences:
The broken links were removed as expected
Some of the links were created to a different C-code due to multiple existing C-codes for a single term (see comment above).
I published the changed filter to STAGE and PROD and ran a hot-fix
for one of the reported drugs with a broken link:
https://www.cancer.gov/publications/dictionaries/cancer-drug?CdrID=791102
All terminology documents will be published as part of the nightly publishing job.
The drug terms have been updated as part of last night’s publishing job. Those broken links reported should not be displayed anymore. ~oseipokuw is still investigating the terms with multiple C-codes for the NCIThesaurusConcept elsewhere.
Please verify on PROD and close this ticket.
For my information.
The branch cts-hotfix has been merged into master of
the cdr-server repository and has been deleted.
https://github.com/NCIOCPL/cdr-server/commit/3cb7610a
~oseipokuw, these
filter changes have been on PROD since October.
Is there anything left to do or can we close this ticket?
We haven't completed fixing the multiple codes problem. That is why I have it open. If you don't mind, I'd like to keep it open until we are done.
I didn't know we were waiting on the content fix of the multiple
C-codes.
We can keep the ticket open if you prefer. Just close it once the
cleanup is completed.
~oseipokuw, I was asked why so many drug terms are missing the C-code?
As we discussed in comments above, the C-code appears in many different places in the term document. We had fixed the links to the NCIThesaurus as part of this ticket by picking up the C-code from the NCIThesaurusConcept element. Unfortunately, there are now many terms with a published C-code in the EVS that don't have a link because the NCIThesaurusConcept isn't available in the CDR term document (e.g. nutraceutical TBL-12 (CDR696523)).
Another problem I see is that some documents include the
attribute NCIThesaurusConceptID in the term document,
populated from the
SourceInformation/VocabularySource ID, which is currently
missing for 2000+ term documents.
Shouldn't this attribute be populated by the
NCIThesaurusConcept ID as well?
We used to do a global periodically to update the NCI Thesaurus IDs but we haven't done a global for several years now. Can we do something similar to what we do for the glossary term links? That is, if there is no C-code, then no link should be created.
Are all the 2000+ terms drug terms? Can you please provide me with a few examples?
I believe those 2000+ are all drug terms. Here are a few of those but I can send you the entire list if needed.
37853|MVA-MUC1-IL2 vaccine
37880|enoxaparin sodium
37977|phenethyl isothiocyanate
38100|vincristine sulfate liposome
38202|DC-cholesterol liposome
38221|gavilimomab
Thanks! All these have the concept codes at the end of the documents which I believe is what is used to create the link. I also checked Cancer.gov for the first one MVA-MUC1-IL2 vaccine and it appears the links are working correctly. Can you please explain the problem with these?
Yes, they have the correct concept code used for the link because the link gets created by using the element NCIThesaurusConcept but they don't include the corresponding Term attribute at all, so Cancer.gov doesn't know what the C-code is. The filter doesn't populate the attribute NCIThesaurusConceptID from the element NCIThesaurusConcept which, in my opinion, is wrong.
Please compare the output of these two filters and look at the Term
root element:
https://cdr.cancer.gov/cgi-bin/cdr/Filter.py?DocId=37880&DocVer=lastp&newdtd=pdqCG.dtd&filter=set%3AVendor%20Term%20Set
I see the difference. Would you modify the filter to add the Term attribute based on the NCIThesaurusConcept element?
Yes, if we decide that the
SourceInformation/VocabularySource/SourceTermId is not the
proper source for the NCIThesaurusConcept information we will
have to modify the filter.
However, a clean-up effort should probably go along with this to ensure
all of the terms listing the concept ID in the SourceTermId
field do include the /Term/NCIThesaurusConcept as well.
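The filter change discussed here could be pictured roughly as below. This is only a sketch of the idea in Python, not the filter itself, which is not shown in this ticket. It assumes the code is the text of /Term/NCIThesaurusConcept, and whether the Public attribute should also gate this step is left open; the function name add_concept_attribute is hypothetical.

```python
import xml.etree.ElementTree as ET


def add_concept_attribute(term_xml: str) -> str:
    """Copy the NCIThesaurusConcept code onto the Term root as an attribute."""
    root = ET.fromstring(term_xml)
    # Assumption: the code is the text of /Term/NCIThesaurusConcept.
    concept = root.find("NCIThesaurusConcept")
    code = (concept.text or "").strip() if concept is not None else ""
    if code:
        # Attribute name as used in the comments above.
        root.set("NCIThesaurusConceptID", code)
    return ET.tostring(root, encoding="unicode")
```

Documents without the element are returned unchanged, which is why the cleanup of terms that list the code only in SourceTermId would need to go along with the change.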
This ticket was left open until CIAT had fixed the multiple codes problem. According to ~oseipokuw, this is now completed and we can close this ticket.
Created a new ticket addressing the problem of the missing attribute with the C-code for the Term element.