CDR Tickets

Issue Number 4156
Summary [DIS] Reevaluate Links Between DIS and DCS
Created 2016-09-22 13:27:39
Issue Type Improvement
Submitted By Juthe, Robin (NIH/NCI) [E]
Assigned To Kline, Bob (NIH/NCI) [C]
Status Closed
Resolved 2016-12-09 11:27:33
Resolution Fixed
Path /home/bkline/backups/jira/ocecdr/issue.194690
Description

We need to discuss whether the links from DCS documents to DIS documents need to be entered using ExternalRef elements (as opposed to DrugSummaryRefs). Using DrugSummaryRef elements in place of the ExternalRefs would likely offer more flexibility in reports (such as the ability to show DCS/DIS relationships) and they would also be less likely to need manual updating (such as when URLs change). The schema appears to allow DrugSummaryRefs already, but the template (below) has ExternalRefs and this is what has been used.

Comment entered 2016-09-29 13:51:52 by Kline, Bob (NIH/NCI) [C]

This would involve modifications to the link control table (we can do this without CBIIT's assistance), and to the template (for which we do need to coordinate with CBIIT). Might want to have a global change.

Comment entered 2016-12-03 06:10:06 by Kline, Bob (NIH/NCI) [C]

Is further discussion needed, or should we proceed with the changes?

Comment entered 2016-12-05 11:22:36 by Juthe, Robin (NIH/NCI) [E]

You can proceed with the changes. We should plan on a global change as part of this effort.

We should discuss how to enforce consistency moving forward, though. Would it be possible to introduce a validation error if an external ref is used for this purpose in the future? Or should the schema prevent it from being added? This piece may need to be a separate issue.

Comment entered 2016-12-05 12:46:05 by Kline, Bob (NIH/NCI) [C]

We'll change the schema (as part of this ticket's work), which should do what you want.

Comment entered 2016-12-08 17:31:48 by Kline, Bob (NIH/NCI) [C]

Running some assumptions by you.

  1. I'll know which DIS docs are DCS docs by looking for DrugInfoType/@Combination ='Yes' in the DIS metadata

  2. I'll find out which DIS doc to link to by looking up the external link cdr:xref to the URL cdr:ref in the query_term table

  3. I'll avoid duplicates in that lookup by avoiding blocked DIS documents as link targets

  4. I'll replace all ExternalRef elements in DCS docs, even those not inside Table blocks

I found one duplicate URL (http://www.cancer.gov/about-cancer/treatment/drugs/valrubicin) in both CDR729629 (blocked) and CDR505384 (active). The only two ExternalRef elements I found in DCS docs on DEV outside of Table blocks were in this block of CDR757219:

<Para cdr:id="_10">

<ExternalRef cdr:xref="http://www.cancer.gov/about-cancer/treatment/drugs/methotrexate">Methotrexate</ExternalRef>
 and 
<ExternalRef cdr:xref="http://www.cancer.gov/about-cancer/treatment/drugs/cytarabine">Cytarabine</ExternalRef>
 are also given as part of this 
<GlossaryTermRef cdr:href="CDR0000045650">combination</GlossaryTermRef>
.
</Para>

Do these assumptions sound right?
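
Assumptions 2 and 3 above can be sketched as follows. This is only an illustration, not the actual global change script: the row layout (document ID, URL, blocked flag) is a hypothetical stand-in for whatever the query_term lookup really returns, and the helper names are invented here. The two valrubicin rows are the duplicate described in this comment.

```python
# Hypothetical rows standing in for the query_term lookup: (doc_id, url, blocked).
# The two entries below are the duplicate URL mentioned in the ticket.
ROWS = [
    (729629, "http://www.cancer.gov/about-cancer/treatment/drugs/valrubicin", True),
    (505384, "http://www.cancer.gov/about-cancer/treatment/drugs/valrubicin", False),
]


def build_url_index(rows):
    """Map each DIS URL to a single document ID, skipping blocked documents."""
    index = {}
    for doc_id, url, blocked in rows:
        if blocked:
            continue  # assumption 3: blocked documents are never link targets
        if url in index and index[url] != doc_id:
            # Two active documents share a URL; refuse to guess between them.
            raise ValueError(f"ambiguous URL {url}: {index[url]} and {doc_id}")
        index[url] = doc_id
    return index


def cdr_href(doc_id):
    """Format an integer ID in the CDR0000012345 style used for cdr:href."""
    return f"CDR{doc_id:010d}"
```

With the blocked document filtered out, the valrubicin URL resolves to CDR0000505384 only.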

Comment entered 2016-12-08 17:48:36 by Juthe, Robin (NIH/NCI) [E]

These assumptions all sound reasonable, but I will run them past Diana tomorrow to confirm.

In the meantime, just wanted to summarize what we discussed in today's meeting.

1. We'll do a global to replace ExternalRefs in the DCS documents with DrugSummaryRefs to the appropriate DIS.
2. We'll update the template for DCS to use DrugSummaryRef elements in the table rather than ExternalRef elements. There won't be any schema changes for this ticket.
3. ExternalRef elements in the DIS will be manually updated (there are a few links to other DIS and one link to an outside resource that will remain an ExternalRef).

Comment entered 2016-12-09 07:59:26 by Kline, Bob (NIH/NCI) [C]

Here's a test run from a global change on DEV:

https://cdr-dev.cancer.gov/cgi-bin/cdr/ShowGlobalChangeTestResults.py?dir=2016-12-09_07-52-07

You might want to consider whether you'll need another ticket or two (CSS changes in XMetaL and filtering changes for publishing, etc.).

Comment entered 2016-12-09 07:59:50 by Kline, Bob (NIH/NCI) [C]

The template has been changed on DEV.

Comment entered 2016-12-09 08:03:30 by Kline, Bob (NIH/NCI) [C]

I checked in XMetaL and as far as I can tell the CSS appears to work fine just as it is. Still might need a ticket or two for filter changes. Added Volker as a watcher so he can weigh in.

Comment entered 2016-12-09 10:33:15 by Juthe, Robin (NIH/NCI) [E]

Hi Bob -

Just confirming that the assumptions above are all fine to make.

I've looked over the results from the test global and the only problem I see is that it appears to be stripping text surrounding the ExternalRefs, too. For example, in CDR757219, which, as you point out above, has ExternalRefs both inside and outside table elements, the following text is removed:

"(Adriamycin)" in the table. (FYI - this convention is used a lot where the brand name is provided in parentheses after the drug name in the table.)
"and" and "are also given as part of this" in the text beneath the table.

Is it possible to keep that text? Thanks.
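
One way to keep the surrounding text is to convert each element in place rather than building a replacement element, since the text that follows an element is stored on the element itself (its "tail"). The sketch below uses Python's standard xml.etree.ElementTree and is not the actual global change code. The namespace URI for the cdr: prefix and the two target IDs are assumptions made for illustration; the sample paragraph is the one quoted from CDR757219 earlier in this ticket.

```python
import xml.etree.ElementTree as ET

CDR_NS = "cips.nci.nih.gov/cdr"  # assumed namespace URI for the cdr: prefix
ET.register_namespace("cdr", CDR_NS)
XREF = f"{{{CDR_NS}}}xref"
HREF = f"{{{CDR_NS}}}href"

BASE = "http://www.cancer.gov/about-cancer/treatment/drugs/"

# Hypothetical target IDs; the real ones would come from the URL lookup.
IDS = {
    BASE + "methotrexate": "CDR0000000001",
    BASE + "cytarabine": "CDR0000000002",
}

SAMPLE = (
    f'<Para xmlns:cdr="{CDR_NS}" cdr:id="_10">'
    f'<ExternalRef cdr:xref="{BASE}methotrexate">Methotrexate</ExternalRef>'
    ' and '
    f'<ExternalRef cdr:xref="{BASE}cytarabine">Cytarabine</ExternalRef>'
    ' are also given as part of this '
    '<GlossaryTermRef cdr:href="CDR0000045650">combination</GlossaryTermRef>'
    '.</Para>'
)


def convert_refs(root, url_to_id):
    """Turn ExternalRef into DrugSummaryRef in place; return the number changed."""
    changed = 0
    # Materialize the list first so renaming does not disturb the iteration.
    for node in list(root.iter("ExternalRef")):
        target = url_to_id.get(node.get(XREF))
        if target is None:
            continue  # no matching DIS document: leave the link untouched
        node.tag = "DrugSummaryRef"  # text and tail stay attached to the node
        del node.attrib[XREF]
        node.set(HREF, target)
        changed += 1
    return changed


if __name__ == "__main__":
    para = ET.fromstring(SAMPLE)
    convert_refs(para, IDS)
    print(ET.tostring(para, encoding="unicode"))
```

Because only the tag and attributes change, " and " and " are also given as part of this " remain where they were.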

Comment entered 2016-12-09 11:27:33 by Kline, Bob (NIH/NCI) [C]
Comment entered 2016-12-09 11:43:26 by Juthe, Robin (NIH/NCI) [E]

Looks good! Could you please run this in live mode on DEV and then I'll have Diana take a look, too?

Comment entered 2016-12-09 11:51:13 by Kline, Bob (NIH/NCI) [C]

Live mode run completed on DEV with no errors.

Comment entered 2016-12-14 09:27:27 by Juthe, Robin (NIH/NCI) [E]

This looks good on DEV.

Comment entered 2016-12-19 18:06:01 by Englisch, Volker (NIH/NCI) [C]

I've noticed a minor display issue in the Structure-View which is not directly related to this ticket but since Bob added me as a watcher I watched, saw, and fixed:

  • R14408: DrugInformationSummary_Structure.css

The Comments element displayed the '&' in the structured view.

Comment entered 2017-01-05 12:09:14 by Juthe, Robin (NIH/NCI) [E]

Bob, could you please run this global in test mode on QA?

Comment entered 2017-01-05 12:53:49 by Kline, Bob (NIH/NCI) [C]
Comment entered 2017-01-05 13:02:22 by Kline, Bob (NIH/NCI) [C]

Test run complete. From the logs:

...
2017-01-05 12:55:00: Processing CDR0000762155 [pub:10/last:10/cwd:10]
2017-01-05 12:55:00: CDR0000762155: cannot find http://www.cancer.gov/about-cancer/treatment/drugs/bortezomib
2017-01-05 12:55:01: Doc CDR0000762155 made invalid by change - will NOT store it
Current working document would become invalid
New pub version would become invalid
New last version would become invalid
Missing required attribute cdr:href in element DrugSummaryRef
...
2017-01-05 12:55:22: Processing CDR0000781712 [pub:11/last:11/cwd:11]
2017-01-05 12:55:23: CDR0000781712: cannot find http://www.cancer.gov/about-cancer/treatment/drugs/busulfan
2017-01-05 12:55:24: Doc CDR0000781712 made invalid by change - will NOT store it
Current working document would become invalid
New pub version would become invalid
New last version would become invalid
Missing required attribute cdr:href in element DrugSummaryRef
...
2017-01-05 12:55:31: Processing CDR0000784806 [pub:4/last:4/cwd:4]
2017-01-05 12:55:32: CDR0000784806: cannot find https://www.cancer.gov/about-cancer/treatment/drugs/carboplatin
2017-01-05 12:55:32: CDR0000784806: cannot find https://www.cancer.gov/about-cancer/treatment/drugs/etoposidephosphate
2017-01-05 12:55:32: CDR0000784806: cannot find https://www.cancer.gov/about-cancer/treatment/drugs/vincristinesulfate
2017-01-05 12:55:33: Doc CDR0000784806 made invalid by change - will NOT store it
Current working document would become invalid
New pub version would become invalid
New last version would become invalid
Missing required attribute cdr:href in element DrugSummaryRef (3 times)
2017-01-05 12:55:33: Run completed.
   Docs examined    = 54
   Versions changed = 162
   Time (hh:mm:ss)  = 00:02:06

Comment entered 2017-01-06 11:24:49 by Juthe, Robin (NIH/NCI) [E]

It appears that each of these failures is caused by a mismatch between http and https URLs. I've manually edited the URLs in the affected documents to use the "https" URL. Could you please try the test global once more on QA? Thanks.

Comment entered 2017-01-06 16:51:30 by Kline, Bob (NIH/NCI) [C]

Lots more failures now. For example:

CDR0000783452: cannot find http://www.cancer.gov/about-cancer/treatment/drugs/carboplatin
CDR0000783452: cannot find http://www.cancer.gov/about-cancer/treatment/drugs/etoposidephosphate

Comment entered 2017-01-06 16:58:32 by Juthe, Robin (NIH/NCI) [E]

Maybe we went in the wrong direction? I corrected (so I thought) the URLs to include an "s" (https://), but other docs must have pointed to the old URL (http://...). Spot-checking a few other DIS, it looks like the URLs are in the old format (http://). I can switch them back.

Comment entered 2017-01-06 17:19:25 by Kline, Bob (NIH/NCI) [C]

Would it help if I accepted a match with either protocol? Checking twice, in other words?
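
A minimal sketch of what "checking twice" could look like, assuming the lookup is a plain mapping from URL to document ID. The function names and the ID used in the usage note are hypothetical, and this is not the code that was actually deployed.

```python
def protocol_variants(url):
    """Return the URL as given, followed by the same URL with the other scheme."""
    if url.startswith("https://"):
        return [url, "http://" + url[len("https://"):]]
    if url.startswith("http://"):
        return [url, "https://" + url[len("http://"):]]
    return [url]  # unknown scheme: nothing to swap


def find_target(url, url_to_id):
    """Look the URL up under either protocol; return None if neither matches."""
    for candidate in protocol_variants(url):
        if candidate in url_to_id:
            return url_to_id[candidate]
    return None
```

For example, if the index holds the http:// form of the bortezomib URL under a placeholder ID, find_target() returns that ID whether the ExternalRef uses http:// or https://, so neither set of documents needs manual editing.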

Comment entered 2017-01-06 17:50:34 by Juthe, Robin (NIH/NCI) [E]

Yes, that would help, if it isn't too much hassle. Thanks.

Comment entered 2017-01-06 19:13:02 by Kline, Bob (NIH/NCI) [C]
Comment entered 2017-01-10 15:05:12 by Juthe, Robin (NIH/NCI) [E]

Great. Test data look good - could you please do a live run on QA?

Comment entered 2017-01-11 10:30:07 by Kline, Bob (NIH/NCI) [C]

Done.

Comment entered 2017-01-11 16:03:43 by Juthe, Robin (NIH/NCI) [E]

Both the DCS template and the global look good.

Verified on QA.

Comment entered 2017-03-17 23:06:57 by Kline, Bob (NIH/NCI) [C]

The global change has run in live mode on PROD.
