CDR Tickets

Issue Number 4691
Summary [DIS] Spreadsheet to compare DIS, Glossary, and Drug Dictionary
Created 2019-10-30 16:05:27
Issue Type Task
Submitted By Juthe, Robin (NIH/NCI) [E]
Assigned To Kline, Bob (NIH/NCI) [C]
Status Closed
Resolved 2020-01-30 18:35:26
Resolution Fixed
Path /home/bkline/backups/jira/ocecdr/issue.251855
Description

As discussed in a previous meeting, we would like to have a spreadsheet generated that contains the following information in four columns:

 

  • CDR ID (of the DIS/DCS)

  • Document Name (of the DIS/DCS)

  • Glossary Definition (text from the associated GTC for the term name linked in the DIS/DCS)

  • Drug Definition (text from the associated term record for the linked term in the DIS; this will be blank for DCS)

 

This will be used to inform the requested values for the drug class ticket (OCECDR-4690) in Leibniz.
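
For illustration only, the requested four-column layout could be sketched as follows. This is not the script that produced the attached spreadsheets: it writes CSV with the Python standard library rather than Excel, and the `Row` class, `write_rows`, `render`, and the sample values are all hypothetical.

```python
import csv
import io
from dataclasses import dataclass

# Column headers taken from the request above.
HEADERS = ["CDR ID", "Document Name", "Glossary Definition", "Drug Definition"]


@dataclass
class Row:
    """One row per DIS/DCS document (hypothetical structure)."""
    cdr_id: int
    doc_name: str
    glossary_definition: str
    drug_definition: str = ""  # left blank for DCS documents


def write_rows(rows, stream):
    """Write the header line and one line per document to a text stream."""
    writer = csv.writer(stream)
    writer.writerow(HEADERS)
    for row in rows:
        writer.writerow([row.cdr_id, row.doc_name,
                         row.glossary_definition, row.drug_definition])


def render(rows):
    """Return the report as a CSV string (stand-in for the real .xlsx output)."""
    buf = io.StringIO(newline="")
    write_rows(rows, buf)
    return buf.getvalue()


if __name__ == "__main__":
    # Made-up sample values, not real CDR data.
    sample = [
        Row(1, "Example DIS", "glossary text", "drug dictionary text"),
        Row(2, "Example DCS", "glossary text"),
    ]
    print(render(sample))
```

The default empty `drug_definition` mirrors the request that the fourth column stay blank for DCS documents.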

Comment entered 2020-01-30 16:41:26 by Kline, Bob (NIH/NCI) [C]

Looks like each DrugInformationSummary has only one link to a Term document, but many links to GlossaryTermName documents. Can you confirm that you're expecting multiple rows for at least some of the DIS docs, one for each linked glossary term name?

Comment entered 2020-01-30 18:35:26 by Kline, Bob (NIH/NCI) [C]

I'm going to guess that you don't want multiple rows for each DIS/DCS document, and that what you had in mind for the glossary links were just the ones in the DrugInfoMetaData block. Let me know if that's not right.
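
A minimal sketch of the distinction, assuming a simplified document shape: `DrugInformationSummary`, `DrugInfoMetaData`, and `GlossaryLink` are named in this ticket, while the `ref` attribute, the `GlossaryTermRef`, `Section`, and `Para` elements, and the sample IDs are assumptions made only for this example and may not match the real CDR schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified document; not real CDR data.
SAMPLE = """\
<DrugInformationSummary>
  <DrugInfoMetaData>
    <GlossaryLink ref="CDR0000000001"/>
  </DrugInfoMetaData>
  <Section>
    <Para>Text with <GlossaryTermRef ref="CDR0000000002">a term</GlossaryTermRef>
    and <GlossaryTermRef ref="CDR0000000003">another</GlossaryTermRef>.</Para>
  </Section>
</DrugInformationSummary>
"""

# Element names treated as glossary links in this sketch (assumed).
GLOSSARY_TAGS = {"GlossaryLink", "GlossaryTermRef"}


def metadata_glossary_link(root):
    """Return the single glossary link in the metadata block, or None."""
    link = root.find("DrugInfoMetaData/GlossaryLink")
    return None if link is None else link.get("ref")


def all_glossary_refs(root):
    """Return every glossary link anywhere in the document."""
    return [el.get("ref") for el in root.iter() if el.tag in GLOSSARY_TAGS]


if __name__ == "__main__":
    root = ET.fromstring(SAMPLE)
    print(metadata_glossary_link(root))   # one value, so one row per document
    print(all_glossary_refs(root))        # several values, so several rows
```

Using only the metadata link keeps the report at one row per document, which is the guess made in the comment above.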

Comment entered 2020-01-30 19:03:40 by Juthe, Robin (NIH/NCI) [E]

Yes, you're right! Just the GlossaryLink that's in the metadata block. Sorry I didn't specify that.

 

This looks great, but I noticed a couple of anomalies that I thought I'd mention:

  • There's some strange formatting in the dictionary definition column. Some of the definitions have large spaces or line breaks within the definition. Is there an easy trick to clean that up? If not, I think we can live with this as it's for internal use.

  • The dictionary definition for selinexor isn't showing up (it corresponds to the following DIS: CDR 798722). We can copy/paste this in, since it seems to be the only publishable DIS with a blank cell, but it seemed odd so I figured I'd mention it. This was run using the data on PROD, right? I double-checked that the term had a publishable definition.

Thank you!

Comment entered 2020-01-30 22:02:05 by Kline, Bob (NIH/NCI) [C]

I have preserved the whitespace exactly as it was entered by the users. I'll take a look at 798722.

Comment entered 2020-01-30 22:26:20 by Kline, Bob (NIH/NCI) [C]

I meant to run it on PROD, but the first one was from DEV. I've added the version for PROD, and that's got the missing definition.

As you can see in the attached screenshot (image-2020-01-30-22-24-54-526.png), the report reflects the way the definition was entered in the CDR.

 

Comment entered 2020-01-31 09:07:48 by Kline, Bob (NIH/NCI) [C]

Would you prefer that I normalize whitespace in the definitions, rather than preserve what the users have entered?
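
One common way to normalize whitespace, shown here as a sketch and not necessarily the method used for the normalized spreadsheet, is to collapse every run of spaces, tabs, and line breaks into a single space and trim the ends. The function name `normalize_whitespace` is hypothetical.

```python
def normalize_whitespace(text):
    """Collapse runs of whitespace to single spaces and strip both ends."""
    # str.split() with no argument splits on any whitespace run and
    # discards empty strings, so joining with " " yields clean text.
    return " ".join(text.split())


if __name__ == "__main__":
    # Made-up definition text with stray spacing and line breaks.
    raw = "  A drug \n\n  that   blocks\tX. "
    print(normalize_whitespace(raw))
```

This discards intentional line breaks along with accidental ones, which is why preserving the text as entered and normalizing it are offered as separate options.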

Comment entered 2020-01-31 09:39:31 by Kline, Bob (NIH/NCI) [C]

I've done it both ways, so you can take your pick. 😃

Comment entered 2020-01-31 09:45:18 by Juthe, Robin (NIH/NCI) [E]

This looks great, Bob. Thank you!!

Attachments
File Name Posted User
image-2020-01-30-22-24-54-526.png 2020-01-30 22:25:20 Kline, Bob (NIH/NCI) [C]
ocecdr-4691.xlsx 2020-01-30 18:32:36 Kline, Bob (NIH/NCI) [C]
ocecdr-4691-prod.xlsx 2020-01-30 22:23:54 Kline, Bob (NIH/NCI) [C]
ocecdr-4691-prod-normalized-20200131092323.xlsx 2020-01-31 09:38:56 Kline, Bob (NIH/NCI) [C]
