Issue Number | 4156 |
---|---|
Summary | [DIS] Reevaluate Links Between DIS and DCS |
Created | 2016-09-22 13:27:39 |
Issue Type | Improvement |
Submitted By | Juthe, Robin (NIH/NCI) [E] |
Assigned To | Kline, Bob (NIH/NCI) [C] |
Status | Closed |
Resolved | 2016-12-09 11:27:33 |
Resolution | Fixed |
Path | /home/bkline/backups/jira/ocecdr/issue.194690 |
We need to discuss whether the links from DCS documents to DIS documents need to be entered using ExternalRef elements (as opposed to DrugSummaryRefs). Using DrugSummaryRef elements in place of the ExternalRefs would likely offer more flexibility in reports (such as the ability to show DCS/DIS relationships) and they would also be less likely to need manual updating (such as when URLs change). The schema appears to allow DrugSummaryRefs already, but the template (below) has ExternalRefs and this is what has been used.
This would involve modifications to the link control table (we can do this without CBIIT's assistance), and to the template (for which we do need to coordinate with CBIIT). Might want to have a global change.
Is further discussion needed, or should we proceed with the changes?
You can proceed with the changes. We should plan on a global change as part of this effort.
We should discuss how to enforce consistency moving forward, though. Would it be possible to introduce a validation error if an external ref is used for this purpose in the future? Or should the schema prevent it from being added? This piece may need to be a separate issue.
We'll change the schema (as part of this ticket's work), which should do what you want.
Running some assumptions by you.
I'll know which DIS docs are DCS docs by looking for DrugInfoType/@Combination='Yes' in the DIS metadata
I'll find out which DIS doc to link to by looking up the external link cdr:xref to the URL cdr:ref in the query_term table
I'll avoid duplicates in that lookup by avoiding blocked DIS documents as link targets
I'll replace all ExternalRef elements in DCS docs, even those not inside Table blocks
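The lookup described in these assumptions could be sketched roughly as follows. This is only an illustration, not the actual global change script: the `UrlRow` type and `build_url_index` function are hypothetical names, and the rows stand in for whatever a query against the query_term table would return. The two sample rows use the valrubicin URL, which is recorded for both a blocked and an active DIS document.

```python
from typing import Dict, Iterable, NamedTuple


class UrlRow(NamedTuple):
    """Hypothetical stand-in for one row from a URL lookup query."""
    doc_id: int    # CDR ID of the DIS document
    url: str       # URL recorded for that document
    blocked: bool  # True if the document is blocked


def build_url_index(rows: Iterable[UrlRow]) -> Dict[str, int]:
    """Map each URL to the single active DIS document that carries it."""
    index: Dict[str, int] = {}
    for row in rows:
        if row.blocked:
            # Blocked documents are never used as link targets.
            continue
        if row.url in index and index[row.url] != row.doc_id:
            # Two active documents share a URL: needs a human decision.
            raise ValueError(f"ambiguous URL {row.url}")
        index[row.url] = row.doc_id
    return index


VALRUBICIN = "http://www.cancer.gov/about-cancer/treatment/drugs/valrubicin"
ROWS = [
    UrlRow(729629, VALRUBICIN, blocked=True),
    UrlRow(505384, VALRUBICIN, blocked=False),
]
INDEX = build_url_index(ROWS)
```

With these rows, only the active document ends up in the index, so the duplicate URL no longer causes an ambiguous match.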
I found one duplicate URL (http://www.cancer.gov/about-cancer/treatment/drugs/valrubicin) in both CDR729629 (blocked) and CDR505384 (active). The only two ExternalRef elements I found in DCS docs on DEV outside of Table blocks were in this block of CDR757219:
<Para cdr:id="_10">
<ExternalRef cdr:xref="http://www.cancer.gov/about-cancer/treatment/drugs/methotrexate">Methotrexate</ExternalRef>
and <ExternalRef cdr:xref="http://www.cancer.gov/about-cancer/treatment/drugs/cytarabine">Cytarabine</ExternalRef>
are also given as part of this <GlossaryTermRef cdr:href="CDR0000045650">combination</GlossaryTermRef>.
</Para>
Do these assumptions sound right?
These assumptions all sound reasonable, but I will run them past Diana tomorrow to confirm.
In the meantime, just wanted to summarize what we discussed in today's meeting.
1. We'll do a global to replace ExternalRefs in the DCS documents
with DrugSummaryRefs to the appropriate DIS.
2. We'll update the template for DCS to use DrugSummaryRef elements in
the table rather than ExternalRef elements. There won't be any schema
changes for this ticket.
3. ExternalRef elements in the DIS will be manually updated (there are a
few links to other DIS and one link to an outside resource that will
remain an ExternalRef).
Here's a test run from a global change on DEV:
https://cdr-dev.cancer.gov/cgi-bin/cdr/ShowGlobalChangeTestResults.py?dir=2016-12-09_07-52-07
You might want to consider whether you'll need another ticket or two (CSS changes in XMetaL and filtering changes for publishing, etc.).
The template has been changed on DEV.
I checked in XMetaL and as far as I can tell the CSS appears to work fine just as it is. Still might need a ticket or two for filter changes. Added ~volker as a watcher so he can weigh in.
Hi Bob -
Just confirming that the assumptions above are all fine to make.
I've looked over the results from the test global and the only problem I see is that it appears to be stripping text surrounding the ExternalRefs, too. For example, in CDR757219, which, as you point out above, has ExternalRefs both inside and outside table elements, the following text is removed:
"(Adriamycin)" in the table. (FYI - this convention is used a lot
where the brand name is provided in parentheses after the drug name in
the table.)
"and" and "are also given as part of this" in the text beneath the
table.
Is it possible to keep that text? Thanks.
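For illustration, here is one way the surrounding text could be kept when swapping the elements. This is a minimal sketch using Python's standard xml.etree.ElementTree, not the actual global change code. In that library the text following an element is stored in the element's `tail`, so a replacement element has to copy it over. The namespace URI, the `convert_external_refs` name, and the two target IDs are assumptions made up for the example.

```python
import xml.etree.ElementTree as ET

CDR_NS = "cips.nci.nih.gov/cdr"  # assumed namespace URI for the cdr: prefix
ET.register_namespace("cdr", CDR_NS)
BASE = "http://www.cancer.gov/about-cancer/treatment/drugs/"


def convert_external_refs(root, url_to_id):
    """Replace ExternalRef elements with DrugSummaryRef, keeping nearby text."""
    for parent in list(root.iter()):
        for pos, child in enumerate(list(parent)):
            if child.tag != "ExternalRef":
                continue
            doc_id = url_to_id.get(child.get(f"{{{CDR_NS}}}xref"))
            if doc_id is None:
                continue  # no known target: leave the ExternalRef alone
            new = ET.Element("DrugSummaryRef", {f"{{{CDR_NS}}}href": doc_id})
            new.text = child.text  # the link text itself
            new.tail = child.tail  # text after the link, e.g. " and "
            parent[pos] = new


SAMPLE = (
    f'<Para xmlns:cdr="{CDR_NS}" cdr:id="_10">'
    f'<ExternalRef cdr:xref="{BASE}methotrexate">Methotrexate</ExternalRef>'
    " and "
    f'<ExternalRef cdr:xref="{BASE}cytarabine">Cytarabine</ExternalRef>'
    " are also given as part of this "
    '<GlossaryTermRef cdr:href="CDR0000045650">combination</GlossaryTermRef>'
    ".</Para>"
)
# Hypothetical target IDs, used only to make the example runnable.
TARGETS = {
    BASE + "methotrexate": "CDR0000000001",
    BASE + "cytarabine": "CDR0000000002",
}
para = ET.fromstring(SAMPLE)
convert_external_refs(para, TARGETS)
converted_text = "".join(para.itertext())
```

After the conversion, the paragraph's text content is unchanged; only the element names and link attributes differ.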
How's this?
https://cdr-dev.cancer.gov/cgi-bin/cdr/ShowGlobalChangeTestResults.py?dir=2016-12-09_11-24-07
Looks good! Could you please run this in live mode on DEV and then I'll have Diana take a look, too?
Live mode run completed on DEV with no errors.
This looks good on DEV.
I've noticed a minor display issue in the Structure-View which is not directly related to this ticket but since Bob added me as a watcher I watched, saw, and fixed:
R14408: DrugInformationSummary_Structure.css
The Comments element displayed the '&' in the structured view.
Bob, could you please run this global in test mode on QA?
Running now: http://cdr-qa.cancer.gov/cgi-bin/cdr/ShowGlobalChangeTestResults.py?dir=2017-01-05_12-53-26
Test run complete. From the logs:
...
2017-01-05 12:55:00: Processing CDR0000762155 [pub:10/last:10/cwd:10]
2017-01-05 12:55:00: CDR0000762155: cannot find http://www.cancer.gov/about-cancer/treatment/drugs/bortezomib
2017-01-05 12:55:01: Doc CDR0000762155 made invalid by change - will NOT store it
Current working document would become invalid
New pub version would become invalid
New last version would become invalid
Missing required attribute cdr:href in element DrugSummaryRef
...
2017-01-05 12:55:22: Processing CDR0000781712 [pub:11/last:11/cwd:11]
2017-01-05 12:55:23: CDR0000781712: cannot find http://www.cancer.gov/about-cancer/treatment/drugs/busulfan
2017-01-05 12:55:24: Doc CDR0000781712 made invalid by change - will NOT store it
Current working document would become invalid
New pub version would become invalid
New last version would become invalid
Missing required attribute cdr:href in element DrugSummaryRef
...
2017-01-05 12:55:31: Processing CDR0000784806 [pub:4/last:4/cwd:4]
2017-01-05 12:55:32: CDR0000784806: cannot find https://www.cancer.gov/about-cancer/treatment/drugs/carboplatin
2017-01-05 12:55:32: CDR0000784806: cannot find https://www.cancer.gov/about-cancer/treatment/drugs/etoposidephosphate
2017-01-05 12:55:32: CDR0000784806: cannot find https://www.cancer.gov/about-cancer/treatment/drugs/vincristinesulfate
2017-01-05 12:55:33: Doc CDR0000784806 made invalid by change - will NOT store it
Current working document would become invalid
New pub version would become invalid
New last version would become invalid
Missing required attribute cdr:href in element DrugSummaryRef (3 times)
2017-01-05 12:55:33: Run completed.
Docs examined = 54
Versions changed = 162
Time (hh:mm:ss) = 00:02:06
It appears that each of these failures is caused by a mismatch between http and https URLs. I've manually edited the URLs in the affected documents to use the "https" URL. Could you please try the test global once more on QA? Thanks.
Lots more failures now. For example:
CDR0000783452: cannot find http://www.cancer.gov/about-cancer/treatment/drugs/carboplatin
CDR0000783452: cannot find http://www.cancer.gov/about-cancer/treatment/drugs/etoposidephosphate
Maybe we went in the wrong direction? I corrected (so I thought) the URLs to include an "s" (https://), but other docs must have pointed to the old URL (http://...). Spot-checking a few other DIS, it looks like the URLs are in the old format (http://). I can switch them back.
Would it help if I accepted a match with either protocol? Checking twice, in other words?
Yes, that would help, if it isn't too much hassle. Thanks.
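As a rough sketch of what "checking twice" could look like (the function names and the sample index are hypothetical, not taken from the actual script): try the URL exactly as written first, then retry with the other protocol.

```python
from typing import Dict, Optional


def alternate_scheme(url: str) -> Optional[str]:
    """Return the same URL with http and https swapped, or None."""
    if url.startswith("https://"):
        return "http://" + url[len("https://"):]
    if url.startswith("http://"):
        return "https://" + url[len("http://"):]
    return None


def find_target(url: str, index: Dict[str, int]) -> Optional[int]:
    """Look up a URL, accepting a match under either protocol."""
    if url in index:
        return index[url]
    other = alternate_scheme(url)
    if other is not None and other in index:
        return index[other]
    return None


# Hypothetical index: the ID is a placeholder, not a real CDR ID.
SAMPLE_INDEX = {
    "http://www.cancer.gov/about-cancer/treatment/drugs/carboplatin": 1,
}
found = find_target(
    "https://www.cancer.gov/about-cancer/treatment/drugs/carboplatin",
    SAMPLE_INDEX,
)
```

Checking the exact URL first means an exact match always wins, and the fallback only applies when the two differ by protocol alone.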
That did the trick.
https://cdr-qa.cancer.gov/cgi-bin/cdr/ShowGlobalChangeTestResults.py?dir=2017-01-06_18-30-38
Great. Test data look good - could you please do a live run on QA?
Done.
Both the DCS template and the global look good.
Verified on QA.
The global change has run in live mode on PROD.