CDR Tickets

Issue Number 4916
Summary Create Ad-hoc Query for Drug Terms
Created 2020-11-13 17:12:26
Issue Type Improvement
Submitted By Englisch, Volker (NIH/NCI) [C]
Assigned To Englisch, Volker (NIH/NCI) [C]
Status Closed
Resolved 2020-11-13 19:48:19
Resolution Fixed
Path /home/bkline/backups/jira/ocecdr/issue.278607
Description

Could you please generate an ad hoc query for all drug terms with the following fields:

  • Preferred name

  • Definition

  • CDR ID

  • NCI Thesaurus ID

This should be for all drug terms that have currently been published to Cancer.gov. If possible, on a second tab, would you include all drug terms that are in the CDR but have not been published to Cancer.gov, along with their status (blocked or not)? Also, please indicate which ones have been made publishable but have not been published to Cancer.gov yet. I know a second tab might not be possible, so a second query should be fine.

Thank you!

William

Comment entered 2020-11-13 17:14:08 by Englisch, Volker (NIH/NCI) [C]

Names for the ad-hoc queries:

  • Drug Terms - published to Drug Dictionary

  • Drug Terms - NOT published

Comment entered 2020-11-13 17:16:16 by Englisch, Volker (NIH/NCI) [C]

The following path has to be added to the list of query terms.  All term documents will need to be re-indexed on all tiers:

/Term/Definition/DefinitionText 

Comment entered 2020-11-13 19:46:25 by Englisch, Volker (NIH/NCI) [C]

Hi Volker,

Would you be able to modify the query to include Other Names in a column, separated by semicolons?

Thanks,

William
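
A minimal sketch of the requested "Other Names" column, assuming the names arrive as (CDR ID, other name) rows. The row layout, the IDs, and the helper name join_other_names are hypothetical and only illustrate the semicolon-separated format; this is not the actual ad-hoc query.

```python
from collections import defaultdict

# Hypothetical (CDR ID, other name) rows; the IDs and names are made up.
rows = [
    (1001, "name A"),
    (1001, "name B"),
    (1001, "name A"),  # duplicate, should appear only once
    (1002, "name C"),
]


def join_other_names(rows, sep="; "):
    """Collapse the other names of each term into one delimited string."""
    grouped = defaultdict(list)
    for doc_id, name in rows:
        # Preserve first-seen order and skip repeated names.
        if name not in grouped[doc_id]:
            grouped[doc_id].append(name)
    return {doc_id: sep.join(names) for doc_id, names in grouped.items()}


other_names = join_other_names(rows)
print(other_names[1001])
```

Terms without any other name simply have no entry in the result, so a report would show an empty cell for them.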

Comment entered 2020-11-13 19:48:10 by Englisch, Volker (NIH/NCI) [C]

The queries have been created on all tiers.  The drug terms have been reindexed.

Please check out the 2 new queries.

Comment entered 2020-11-16 09:19:15 by Osei-Poku, William (NIH/NCI) [C]

Thanks. When I run the first query, for drug terms published to Cancer.gov, it retrieves 7261 terms; however, on Cancer.gov the total number appears to be 7845. I am not sure where the discrepancy is coming from, but it appears to be significant.

Comment entered 2020-11-16 13:43:59 by Englisch, Volker (NIH/NCI) [C]

I see that Cancer.gov lists brand names as separate entries, which inflates its total relative to ours: a drug is counted multiple times, once for the preferred name and once for the brand name.

I'm still looking at the possibility that some terms aren't listing the C-code properly in the field for the NCIThesaurusConcept. A term without this element would be dropped from the list.
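
A minimal sketch of why a term without a C-code can drop out of the output, assuming the query joins a name row to a C-code row per document. It uses an in-memory SQLite table as a stand-in for query_term; the column layout (doc_id, path, value), the two paths, and the sample rows are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed, simplified layout of the query_term index table.
conn.execute("CREATE TABLE query_term (doc_id INTEGER, path TEXT, value TEXT)")
conn.executemany(
    "INSERT INTO query_term VALUES (?, ?, ?)",
    [
        (1, "/Term/PreferredName", "drug one"),
        (1, "/Term/NCIThesaurusConcept", "C00001"),
        (2, "/Term/PreferredName", "drug two"),  # no C-code row for this term
    ],
)

QUERY = """
SELECT n.doc_id, n.value, c.value
  FROM query_term n
  {join} query_term c
    ON c.doc_id = n.doc_id
   AND c.path = '/Term/NCIThesaurusConcept'
 WHERE n.path = '/Term/PreferredName'
 ORDER BY n.doc_id
"""

# An inner join requires a matching C-code row, so term 2 disappears.
inner_rows = conn.execute(QUERY.format(join="JOIN")).fetchall()
# A left outer join keeps term 2 and reports the C-code as NULL (None).
left_rows = conn.execute(QUERY.format(join="LEFT JOIN")).fetchall()

print(inner_rows)
print(left_rows)
```

If the real query behaves like the inner join here, switching to an outer join would be one way to keep such terms visible with an empty C-code column.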

Comment entered 2020-11-17 16:28:45 by Englisch, Volker (NIH/NCI) [C]

I looked through the output for one letter (Z) to compare and find any differences between the Cancer.gov list and the ad-hoc query output:

  • The total number of drugs listed on Cancer.gov is 113

  • 55 of those are brand names and need to be excluded
    Total on Cancer.gov: 58

  • 2 terms are missing on Cancer.gov because of an incorrect SemanticType (this is a bug in our export software)
    Total on Cancer.gov: 60

  • Ad-hoc query lists 59

  • 1 term is excluded from Cancer.gov because of a missing definition (missing ReviewStatus)
    Total ad-hoc: 58

  • 2 terms are excluded from ad-hoc query because of missing C-code
    Total ad-hoc: 60

It appears the numbers are matching between both lists when the missing data elements are considered.
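
The reconciliation for the letter Z can be checked with a few lines of arithmetic. The figures are taken from the list above; the variable names are only illustrative.

```python
# Tallies for the letter Z, as reported in this comment.
cancer_gov_listed = 113     # entries shown on Cancer.gov
brand_names = 55            # brand-name entries listed separately on Cancer.gov
missing_semantic_type = 2   # terms missing on Cancer.gov (export bug)

adhoc_listed = 59           # rows returned by the ad-hoc query
missing_review_status = 1   # in the ad-hoc output but excluded from Cancer.gov
missing_c_code = 2          # excluded from the ad-hoc output

# Adjust each list for the entries the other list counts differently.
cancer_gov_total = cancer_gov_listed - brand_names + missing_semantic_type
adhoc_total = adhoc_listed - missing_review_status + missing_c_code

print(cancer_gov_total, adhoc_total)  # prints: 60 60
```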

These are the terms with issues:

  • 803278 - zelenoleucel: SemanticType

  • 803287 - zirconium Zr 89-DFO-fianlimab: SemanticType

  • 764414 - zirconium Zr 89-girentuximab: missing ReviewStatus

  • zirconium Zr 89-desferrioxamine B monoclonal antibody huJ591: missing C-code

  • zirconium Zr 89-labeled anti-PIGF monoclonal antibody RO5323441: missing C-code

  • zirconium Zr 89˗DFO˗REGN3504
    For this term the dash ('-') character is not an ASCII hyphen; it is displayed as a question mark in the ad-hoc query.
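
A minimal sketch for spotting such characters in a term name, assuming the dash in this term is U+02D7 (MODIFIER LETTER MINUS SIGN), which is how it appears in this ticket. The helper name find_non_ascii is hypothetical, and which encoding the ad-hoc query output actually uses is not known; the ASCII round trip at the end only illustrates how a question mark can appear.

```python
import unicodedata


def find_non_ascii(text):
    """Return (index, character, Unicode name) for every non-ASCII character."""
    return [
        (i, ch, unicodedata.name(ch, "UNKNOWN"))
        for i, ch in enumerate(text)
        if ord(ch) > 127
    ]


# Term name written with U+02D7 in place of the ASCII hyphen-minus.
name = "zirconium Zr 89\u02d7DFO\u02d7REGN3504"

hits = find_non_ascii(name)
for index, ch, uname in hits:
    print(index, f"U+{ord(ch):04X}", uname)

# Encoding to a character set that lacks the character, with replacement
# enabled, substitutes a question mark for it.
as_ascii = name.encode("ascii", errors="replace").decode("ascii")
print(as_ascii)
```

Replacing the character with a plain hyphen in the term document would be the direct fix; the check above only helps to find affected names.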

Comment entered 2020-11-18 15:10:21 by Osei-Poku, William (NIH/NCI) [C]

Could you please include only terms with a Drug/Agent semantic type? Currently the spreadsheet includes terms that are not drug terms, such as:

  • 37766 - stage 0 chronic lymphocytic leukemia

  • 37769 - adult lymphoblastic lymphoma

  • 37790 - Waldenström macroglobulinemia
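
A minimal sketch of the requested restriction, assuming each term record carries a list of semantic type names. The records, their IDs, and the type names other than Drug/Agent are made up for illustration; the real query would apply the equivalent condition on the indexed SemanticType value.

```python
# Hypothetical term records; IDs, names, and semantic types are made up.
terms = [
    {"cdr_id": 1, "name": "example drug", "semantic_types": ["Drug/Agent"]},
    {"cdr_id": 2, "name": "example diagnosis", "semantic_types": ["Diagnosis"]},
    {"cdr_id": 3, "name": "example agent", "semantic_types": ["Other", "Drug/Agent"]},
]


def drug_terms_only(terms, wanted="Drug/Agent"):
    """Keep only the terms that carry the wanted semantic type (exact match)."""
    return [t for t in terms if wanted in t["semantic_types"]]


kept = drug_terms_only(terms)
print([t["cdr_id"] for t in kept])  # prints: [1, 3]
```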

Comment entered 2020-11-18 17:37:37 by Englisch, Volker (NIH/NCI) [C]

The query has been updated on PROD.

Comment entered 2020-11-19 10:56:05 by Osei-Poku, William (NIH/NCI) [C]

The errors have been fixed. Thank you!

Comment entered 2020-11-19 10:56:21 by Osei-Poku, William (NIH/NCI) [C]

Looks good on PROD. Thank you!

Comment entered 2020-11-19 21:31:55 by Osei-Poku, William (NIH/NCI) [C]

Adding an offline conversation about this ticket.

 

 

 

From: Englisch, Volker (NIH/NCI) [C] <volker@mail.nih.gov>
Sent: Thursday, November 19, 2020 5:14 PM
To: Osei-Poku, William <William.Osei-Poku@icf.com>
Subject: Re: CDR oddity this week

 

In that case, why don’t we remove the definition from the query but keep the query with a comment and delete the query term?

Deleting the query_term will prevent the warning message from popping up with every save, and the comment should remind us what to do the next time we need to run this query with the definition included.

 

Plus, we should add a comment to the ticket.

 

Thanks,

 

        Volker

Volker Englisch
NCI OCPL – Office of Communications & Public Liaison

Contractor: publicis sapient
NCI: 240-276-6583

 

 

From: "Osei-Poku, William" <William.Osei-Poku@icf.com>
Date: Thursday, November 19, 2020 at 3:48 PM
To: Volker Englisch <volker@mail.nih.gov>
Subject: RE: CDR oddity this week

 

  1. We can remove it from the query terms and the report for now as we’ve already generated a spreadsheet to send to EVS. However, I am sure at some point, they will ask for another one so we may have to repeat this whole thing again.

 

Thanks,

William

 

 

From: Englisch, Volker (NIH/NCI) [C] <volker@mail.nih.gov>
Sent: Thursday, November 19, 2020 3:46 PM
To: Osei-Poku, William <William.Osei-Poku@icf.com>
Subject: Re: CDR oddity this week

 

I think the only way to avoid this would be either to remove the DrugDefinitionText from the query terms and from the new ad-hoc report, or to create the report via a Python script.

 

It’s just a notification and therefore not really something to be concerned about, but I can see how it might get annoying.

 

Thanks,

 

        Volker


 

 

From: "Osei-Poku, William" <William.Osei-Poku@icf.com>
Date: Thursday, November 19, 2020 at 3:22 PM
To: Volker Englisch <volker@mail.nih.gov>
Subject: RE: CDR oddity this week

 

Should we just ignore it?

Thanks,

William

 

From: Englisch, Volker (NIH/NCI) [C] <volker@mail.nih.gov>
Sent: Thursday, November 19, 2020 2:57 PM
To: Osei-Poku, William <William.Osei-Poku@icf.com>
Subject: Re: CDR oddity this week

 

I don’t think it’s a limit on drug definitions but on indexed terms.  I had to add the drug definitions to the query_term index table for your ad-hoc report.  That’s likely why you’re now seeing this because the query_term table gets updated with every save.

 

Thanks,

 

        Volker


 

 

From: "Osei-Poku, William" <William.Osei-Poku@icf.com>
Date: Thursday, November 19, 2020 at 2:46 PM
To: Volker Englisch <volker@mail.nih.gov>
Subject: FW: CDR oddity this week

 

Hi Volker,

Are you aware that there is an 800-character limit on drug definitions?

Thanks,

William

 

 

From: Barnstead, Mary (NIH/NCI) [C] <mary.barnstead@nih.gov>
Sent: Thursday, November 19, 2020 2:44 PM
To: Osei-Poku, William <William.Osei-Poku@icf.com>
Subject: RE: CDR oddity this week

 

Hi William,

 

I first started seeing it yesterday.

 

Thanks

Mary

 

From: Osei-Poku, William <William.Osei-Poku@icf.com>
Sent: Thursday, November 19, 2020 2:27 PM
To: Barnstead, Mary (NIH/NCI) [C] <mary.barnstead@nih.gov>
Subject: RE: CDR oddity this week

 

Hi Mary,

Is this the first time you’re seeing this, or have you seen it before?

Thanks,

William

 

 

From: Barnstead, Mary (NIH/NCI) [C] <mary.barnstead@nih.gov>
Sent: Thursday, November 19, 2020 8:53 AM
To: Osei-Poku, William <William.Osei-Poku@icf.com>
Subject: CDR oddity this week

 

Hi William,

 

In term records, I’ve started getting this message when there is a long definition:

 

Is this anything to worry about?

 

Thanks!

Mary

______________________________________________________________________________________

Mary Barnstead, MS PMP

CIAT Terminology and Drug Information Manager – NCI/ICF

(301) 407-6640 (office) (240) 449-9762 (cell)
