Issue Number | 5331 |
---|---|
Summary | Update Clinical Trials URL |
Created | 2024-08-15 14:34:16 |
Issue Type | Task |
Submitted By | Shields, Victoria (NIH/NCI) [E] |
Assigned To | Kline, Bob (NIH/NCI) [C] |
Status | Closed |
Resolved | 2024-09-23 11:51:05 |
Resolution | Fixed |
Path | /home/bkline/backups/jira/ocecdr/issue.457905 |
The URL of the Clinical Trials Information for Patients and Caregivers landing page changed in September 2023. In November 2023, the English miscellaneous documents were republished with the new URL. However, 74 English summaries have the old link embedded in the following sentence: Information about clinical trials is available from the NCI website. We would like to programmatically update the URL in the summaries with this sentence. The new URL is https://www.cancer.gov/research/participate/clinical-trials. The standard wording report with the affected English summaries is attached.
Changes also need to be made for the Spanish documents and Linda will add that information to this ticket. Bob determined that one ticket can be used for both English and Spanish.
Bob, Volker, and I discussed this at the 8/15/2024 CDR meeting and decided that redirects for sub-URLs will be fixed at a later time with a separate ticket.
Hi Bob,
The URL of the Spanish Clinical Trials Information for Patients and Caregivers (Información sobre estudios clínicos para pacientes y cuidadores) page in 46 HP Spanish summaries and in two Misc. Spanish docs was updated and published last week. However, 110 patient summaries still have the old link embedded in the following sentences (marked in red in the attached standard wording report):
Para obtener más información, consulte Información sobre estudios clínicos para pacientes y cuidadores.
Para obtener más información sobre ensayos clínicos, consulte el portal de Internet del NCI.
La información sobre ensayos clínicos está disponible en el portal de Internet del NCI.
Para obtener información sobre ensayos clínicos, consulte el portal de Internet del NCI.
The new Spanish URL is https://www.cancer.gov/espanol/investigacion/participe/estudios-clinicos.
Additionally, the SourceTitle for this page will need to be changed in all summaries to "Información sobre estudios clínicos para pacientes y cuidadores".
Please let me know if you have any questions.
Thank you,
Linda
Aside from the fact that I don't see anything marked in red in the Spanish standard wording report, selecting documents to be processed by a global change script by parsing the text of a word-processing document is not a reliable approach. We should identify what needs to be changed by looking for ExternalRef elements in Summary documents which have specific values in the cdr:xref attribute. This means you will need to tell me what the old URL is which needs to be replaced. Also, if you left the SourceTitle attribute unchanged when you replaced the URL in some of the summaries, we'll also need to look for elements which already have the new URL in the cdr:xref attribute to change the SourceTitle for those elements as well.
Can someone confirm that we're just changing SourceTitle values in Spanish summaries, not the English summaries?
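A minimal sketch of the selection approach described above: find ExternalRef elements by the value of their cdr:xref attribute instead of parsing report text. The namespace URI bound to the cdr prefix, the sample document, and the helper name find_external_refs are assumptions made for illustration; this is not the actual global change script.

```python
import xml.etree.ElementTree as ET

# Assumption: the namespace URI bound to the "cdr" prefix in CDR documents.
CDR_NS = "cips.nci.nih.gov/cdr"
XREF = f"{{{CDR_NS}}}xref"

# Old English URL as quoted elsewhere in this ticket.
OLD_ENGLISH = "https://www.cancer.gov/about-cancer/treatment/clinical-trials"

# Made-up sample document, only for demonstrating the lookup.
SAMPLE = f"""<Summary xmlns:cdr="{CDR_NS}">
  <Para>See the <ExternalRef cdr:xref="{OLD_ENGLISH}"
        SourceTitle="Old title">NCI website</ExternalRef>.</Para>
  <Para><ExternalRef cdr:xref="https://example.org/other">other</ExternalRef></Para>
</Summary>"""


def find_external_refs(xml_text, urls):
    """Return (url, SourceTitle) for each ExternalRef whose cdr:xref is in urls."""
    root = ET.fromstring(xml_text)
    matches = []
    for node in root.iter("ExternalRef"):
        url = node.get(XREF)
        if url in urls:
            # SourceTitle is None when the attribute is absent.
            matches.append((url, node.get("SourceTitle")))
    return matches


if __name__ == "__main__":
    for url, title in find_external_refs(SAMPLE, {OLD_ENGLISH}):
        print(url, title)
```

Passing both the old and the new URL in the set would also surface links that already point at the new page but may still carry a stale SourceTitle.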
Hi ~bkline , I attached another version just in case you'd like to see where the links to the clinical trial appear.
Thank you for your explanation and specifying the information you need.
Also found a couple Spanish patient summaries that by mistake are linking to the old English URL: https://www.cancer.gov/about-cancer/treatment/clinical-trials
I changed all the SourceTitles when I replaced the URLs last week in the HP summaries and the 2 Spanish Misc. docs.
Thank you,
Linda
Again, I want to make absolutely sure that you are not expecting that the global change will make any use of any information in any of the Microsoft Word documents attached to this ticket. Even if it is possible to create software which would 100% reliably be able to correctly parse all the possible representations of information in a word-processing document (and I'm not sure that's possible), it would be prohibitively expensive to develop and test such software. Simply put, spreadsheets are inherently machine-parsable, and word-processing documents are not. Word-processing documents are intended for rendering information to be consumed by human readers, not by machines.
For this task, we need neither spreadsheets nor word-processing documents. All we need are rules for identifying which elements are to be modified, and what that modification will do. Here are the rules I propose.
FOR EACH CDR SUMMARY DOCUMENT:
    FOR EACH ExternalRef ELEMENT:
        IF THE DOCUMENT IS A SPANISH SUMMARY:
            IF THE VALUE OF THE cdr:xref ATTRIBUTE HAS THE OLD ENGLISH OR SPANISH URL VALUE:
                REPLACE THAT ATTRIBUTE'S VALUE WITH THE NEW SPANISH URL
                SET THE SourceTitle ATTRIBUTE'S VALUE TO THE NEW SPANISH SOURCE TITLE
        OTHERWISE:
            IF THE VALUE OF THE cdr:xref ATTRIBUTE HAS THE OLD ENGLISH URL VALUE:
                REPLACE THAT ATTRIBUTE'S VALUE WITH THE NEW ENGLISH URL
Please review and confirm that these rules correctly reflect the requirements.
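The proposed rules could be sketched in Python roughly as follows. This is an illustration under stated assumptions, not the script that was run: the cdr namespace URI is assumed, OLD_SPANISH is a placeholder because the old Spanish URL is not stated in this ticket, and the caller is assumed to already know whether a document is a Spanish summary.

```python
import xml.etree.ElementTree as ET

# Assumption: the namespace URI bound to the "cdr" prefix in CDR documents.
CDR_NS = "cips.nci.nih.gov/cdr"
XREF = f"{{{CDR_NS}}}xref"

OLD_ENGLISH = "https://www.cancer.gov/about-cancer/treatment/clinical-trials"
# Placeholder only: the old Spanish URL does not appear in this ticket.
OLD_SPANISH = "https://example.invalid/old-spanish-url"
NEW_ENGLISH = "https://www.cancer.gov/research/participate/clinical-trials"
NEW_SPANISH = "https://www.cancer.gov/espanol/investigacion/participe/estudios-clinicos"
SPANISH_TITLE = "Información sobre estudios clínicos para pacientes y cuidadores"


def apply_rules(root, is_spanish):
    """Apply the proposed rules in place; return the number of elements changed."""
    changed = 0
    for node in root.iter("ExternalRef"):
        url = node.get(XREF)
        if is_spanish:
            # Spanish summaries: old English or old Spanish URL -> new Spanish URL
            # plus the new Spanish SourceTitle.
            if url in (OLD_ENGLISH, OLD_SPANISH):
                node.set(XREF, NEW_SPANISH)
                node.set("SourceTitle", SPANISH_TITLE)
                changed += 1
        elif url == OLD_ENGLISH:
            # English summaries: only the URL is replaced under these rules.
            node.set(XREF, NEW_ENGLISH)
            changed += 1
    return changed


if __name__ == "__main__":
    doc = ET.fromstring(
        f'<Summary xmlns:cdr="{CDR_NS}">'
        f'<ExternalRef cdr:xref="{OLD_ENGLISH}">NCI</ExternalRef></Summary>'
    )
    print(apply_rules(doc, is_spanish=True))
```

Matching on the exact attribute value keeps the change limited to links to the landing page itself, so links to pages beneath it are left for separate handling.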
Hi ~bkline , Thank you for the explanation. I understand. My apologies if the Word doc caused any confusion. I wanted to give you an idea of how many Spanish summaries need to be updated and where the links appear in case it was helpful.
The rule you propose for the Spanish summaries looks good.
Thank you,
Linda
Logs from the test run on DEV:
https://cdr-dev.cancer.gov/cgi-bin/cdr/ShowGlobalChangeTestResults.py?dir=2024-08-05_07-58-27
Edit: no, sorry, that was from an earlier global change. By mistake I ran this on my DEV VM. Same database and script, but different log location. Here are the real logs for this test job:
https://nciws-d2019-v.nci.nih.gov/cgi-bin/cdr/ShowGlobalChangeTestResults.py?dir=2024-08-28_11-26-17
Please change/add "Clinical Trials Information for Patients and Caregivers" to the English summaries' SourceTitle values. Thanks!
As discussed in the CDR/EBMS meeting today, I am clarifying that the text above should be added to the English summaries as their SourceTitle value. If a SourceTitle value is already populated with a different value, it should be changed to the above text.
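With this clarification, English links also receive a SourceTitle. One way to picture the revised rule is as a small lookup keyed by summary language, sketched below. The function name revised_attributes and the OLD_SPANISH placeholder are hypothetical, and treating links that already carry the new URL as in scope is an assumption; the ticket does not say whether the title change applies to them.

```python
OLD_ENGLISH = "https://www.cancer.gov/about-cancer/treatment/clinical-trials"
# Placeholder only: the old Spanish URL does not appear in this ticket.
OLD_SPANISH = "https://example.invalid/old-spanish-url"
NEW_ENGLISH = "https://www.cancer.gov/research/participate/clinical-trials"
NEW_SPANISH = "https://www.cancer.gov/espanol/investigacion/participe/estudios-clinicos"
ENGLISH_TITLE = "Clinical Trials Information for Patients and Caregivers"
SPANISH_TITLE = "Información sobre estudios clínicos para pacientes y cuidadores"

# Target (URL, SourceTitle) for each summary language.
TARGETS = {
    "English": (NEW_ENGLISH, ENGLISH_TITLE),
    "Spanish": (NEW_SPANISH, SPANISH_TITLE),
}

# Old URLs that should be replaced, per summary language.
OLD_URLS = {
    "English": {OLD_ENGLISH},
    "Spanish": {OLD_ENGLISH, OLD_SPANISH},
}


def revised_attributes(language, xref, source_title):
    """Return the (cdr:xref, SourceTitle) pair a link should end up with.

    Links outside the scope of the change are returned unchanged.
    """
    new_url, new_title = TARGETS[language]
    # Assumption: links already pointing at the new URL also get the title.
    if xref in OLD_URLS[language] or xref == new_url:
        return new_url, new_title
    return xref, source_title


if __name__ == "__main__":
    print(revised_attributes("English", OLD_ENGLISH, None))
```

Keeping the URLs and titles in one table makes a stray character in a constant, such as a trailing period, easier to spot during review.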
Global change script modified to incorporate new requirements.
https://cdr-dev.cancer.gov/cgi-bin/cdr/ShowGlobalChangeTestResults.py?dir=2024-09-06_11-21-27
The new URL for the Spanish summaries appears to end with a period that is not needed. Please remove it and rerun in test mode on DEV. Thanks!
"https://www.cancer.gov/espanol/investigacion/participe/estudios-clinicos."
Looks good. Please run in live mode on DEV. Thanks!
Done.
Looks good on DEV. Please run in test mode on QA. Thanks!
Verified. Please run in live mode on QA. Thanks!
Done.
Looks good on QA. Please run in test mode on PROD.
Verified. Please run in live mode on PROD. Thanks!
Done.
Verified on PROD. Thanks!
File Name | Posted | User |
---|---|---|
ClinicalTrialURLLinks_SpanishPatSummaries.docx | 2024-08-20 14:05:19 | Saucedo, Linda (NIH/NCI) [C] |
ClinicalTrialURLLinks_SpanishPatSummaries-1.docx | 2024-08-21 08:52:04 | Saucedo, Linda (NIH/NCI) [C] |
Standard Wording Report_CT Information.docx | 2024-08-15 14:34:02 | Shields, Victoria (NIH/NCI) [E] |