CDR Tickets

Issue Number 5331
Summary Update Clinical Trials URL
Created 2024-08-15 14:34:16
Issue Type Task
Submitted By Shields, Victoria (NIH/NCI) [E]
Assigned To Kline, Bob (NIH/NCI) [C]
Status Closed
Resolved 2024-09-23 11:51:05
Resolution Fixed
Path /home/bkline/backups/jira/ocecdr/issue.457905
Description

The URL of the Clinical Trials Information for Patients and Caregivers landing page changed in September 2023. In November 2023, the English miscellaneous documents were republished with the new URL. However, 74 English summaries have the old link embedded in the following sentence: "Information about clinical trials is available from the NCI website." We would like to programmatically update the URL in the summaries with this sentence. The new URL is https://www.cancer.gov/research/participate/clinical-trials. The standard wording report with the affected English summaries is attached.

 

Changes also need to be made for the Spanish documents and Linda will add that information to this ticket. Bob determined that one ticket can be used for both English and Spanish.

 

Bob, Volker, and I discussed this at the 8/15/2024 CDR meeting and decided that redirects for sub-URLs will be fixed at a later time with a separate ticket.

Comment entered 2024-08-20 14:04:52 by Saucedo, Linda (NIH/NCI) [C]

Hi Bob,

The URL of the Spanish Clinical Trials Information for Patients and Caregivers (Información sobre estudios clínicos para pacientes y cuidadores) page was updated in 46 HP Spanish summaries and in two Misc. Spanish docs, and those documents were published last week. However, 110 patient summaries still have the old link embedded in the following sentences (marked in red in the attached standard wording report):

  • Para obtener más información, consulte Información sobre estudios clínicos para pacientes y cuidadores.

  • Para obtener más información sobre ensayos clínicos, consulte el portal de Internet del NCI.

  • La información sobre ensayos clínicos está disponible en el portal de Internet del NCI.

  • Para obtener información sobre ensayos clínicos, consulte el portal de Internet del NCI.

The new Spanish URL is https://www.cancer.gov/espanol/investigacion/participe/estudios-clinicos.

Additionally, the SourceTitle for this page will need to be changed in all summaries to "Información sobre estudios clínicos para pacientes y cuidadores".

Please let me know if you have any questions.

Thank you,

Linda

Comment entered 2024-08-20 15:05:33 by Kline, Bob (NIH/NCI) [C]

Aside from the fact that I don't see anything marked in red in the Spanish standard wording report, selecting documents to be processed by a global change script by parsing the text of a word-processing document is not a reliable approach. We should identify what needs to be changed by looking for ExternalRef elements in Summary documents which have specific values in the cdr:xref attribute. This means you will need to tell me what the old URL is which needs to be replaced. Also, if you left the SourceTitle attribute unchanged when you replaced the URL in some of the summaries, we'll also need to look for elements which already have the new URL in the cdr:xref attribute to change the SourceTitle for those elements as well.
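As a rough illustration of the selection approach described above, the following sketch scans a Summary document for ExternalRef elements by their cdr:xref value. It is not the actual global change script. The namespace URI, the unqualified SourceTitle attribute name, and the sample document are assumptions, and the old Spanish URL is a placeholder because it is not given in this ticket.

```python
import xml.etree.ElementTree as ET

# Assumption: namespace URI bound to the "cdr" prefix in CDR documents.
CDR_NS = "cips.nci.nih.gov/cdr"
XREF = f"{{{CDR_NS}}}xref"

# The old English URL is quoted elsewhere in this ticket. The old Spanish URL
# is not, so a placeholder (hypothetical) stands in for it here.
OLD_URLS = {
    "https://www.cancer.gov/about-cancer/treatment/clinical-trials",
    "OLD-SPANISH-URL-PLACEHOLDER",
}
NEW_SPANISH_URL = (
    "https://www.cancer.gov/espanol/investigacion/participe/estudios-clinicos"
)
NEW_SPANISH_TITLE = (
    "Información sobre estudios clínicos para pacientes y cuidadores"
)


def find_candidates(xml_text):
    """Return (reason, element) pairs for ExternalRef elements to review."""
    root = ET.fromstring(xml_text)
    hits = []
    for node in root.iter("ExternalRef"):
        url = node.get(XREF)
        if url in OLD_URLS:
            # The link itself still points at an old address.
            hits.append(("old-url", node))
        elif url == NEW_SPANISH_URL and node.get("SourceTitle") != NEW_SPANISH_TITLE:
            # URL already replaced by hand, but the title was left behind.
            hits.append(("stale-title", node))
    return hits


# Hypothetical sample document, heavily simplified.
SAMPLE = """<Summary xmlns:cdr="cips.nci.nih.gov/cdr">
  <Para>
    <ExternalRef cdr:xref="https://www.cancer.gov/about-cancer/treatment/clinical-trials">a</ExternalRef>
    <ExternalRef cdr:xref="https://www.cancer.gov/espanol/investigacion/participe/estudios-clinicos" SourceTitle="Old title">b</ExternalRef>
    <ExternalRef cdr:xref="https://example.org/unrelated">c</ExternalRef>
  </Para>
</Summary>"""

for reason, node in find_candidates(SAMPLE):
    print(reason, node.get(XREF))
```

Selecting on attribute values this way does not depend on the visible sentence wording at all, which is why the old URL values are the only input the script needs.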

Comment entered 2024-08-20 15:10:30 by Kline, Bob (NIH/NCI) [C]

Can someone confirm that we're just changing SourceTitle values in Spanish summaries, not the English summaries?

Comment entered 2024-08-21 09:09:43 by Saucedo, Linda (NIH/NCI) [C]

Hi Bob, I attached another version just in case you'd like to see where the links to the clinical trials page appear.

Thank you for your explanation and specifying the information you need.

I also found a couple of Spanish patient summaries that mistakenly link to the old English URL: https://www.cancer.gov/about-cancer/treatment/clinical-trials

 

I changed all the SourceTitles when I replaced the URLs last week in the HP summaries and the 2 Spanish Misc. docs.

Thank you,

Linda

Comment entered 2024-08-21 09:58:57 by Kline, Bob (NIH/NCI) [C]

Again, I want to make absolutely sure that you are not expecting that the global change will make any use of any information in any of the Microsoft Word documents attached to this ticket. Even if it is possible to create software which would 100% reliably be able to correctly parse all the possible representations of information in a word-processing document (and I'm not sure that's possible), it would be prohibitively expensive to develop and test such software. Simply put, spreadsheets are inherently machine-parsable, and word-processing documents are not. Word-processing documents are intended for rendering information to be consumed by human readers, not by machines.

For this task, we need neither spreadsheets nor word-processing documents. All we need are rules for identifying which elements are to be modified, and what that modification will do. Here are the rules I propose.

FOR EACH CDR SUMMARY DOCUMENT:
  FOR EACH ExternalRef ELEMENT:
    IF THE DOCUMENT IS A SPANISH SUMMARY:
      IF THE VALUE OF THE cdr:xref ATTRIBUTE HAS THE OLD ENGLISH OR SPANISH URL VALUE:
        REPLACE THAT ATTRIBUTE'S VALUE WITH THE NEW SPANISH URL
        SET THE SourceTitle ATTRIBUTE'S VALUE TO THE NEW SPANISH SOURCE TITLE
    OTHERWISE:
      IF THE VALUE OF THE cdr:xref ATTRIBUTE HAS THE OLD ENGLISH URL VALUE:
        REPLACE THAT ATTRIBUTE'S VALUE WITH THE NEW ENGLISH URL

Please review and confirm that these rules correctly reflect the requirements.
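For reviewers who find code easier to check than prose, the rules above could be sketched roughly as follows. This is only an illustration, not the script that was run. The namespace URI, the SummaryMetaData/SummaryLanguage path used to detect Spanish summaries, and the sample documents are assumptions, and the old Spanish URL is a placeholder because it is not given in this ticket.

```python
import xml.etree.ElementTree as ET

CDR_NS = "cips.nci.nih.gov/cdr"  # assumption: URI for the "cdr" prefix
ET.register_namespace("cdr", CDR_NS)
XREF = f"{{{CDR_NS}}}xref"

OLD_ENGLISH_URL = "https://www.cancer.gov/about-cancer/treatment/clinical-trials"
OLD_SPANISH_URL = "OLD-SPANISH-URL-PLACEHOLDER"  # not given in this ticket
NEW_ENGLISH_URL = "https://www.cancer.gov/research/participate/clinical-trials"
NEW_SPANISH_URL = (
    "https://www.cancer.gov/espanol/investigacion/participe/estudios-clinicos"
)
NEW_SPANISH_TITLE = (
    "Información sobre estudios clínicos para pacientes y cuidadores"
)


def is_spanish(root):
    # Assumption: the language is recorded in SummaryMetaData/SummaryLanguage.
    return root.findtext("SummaryMetaData/SummaryLanguage") == "Spanish"


def apply_rules(root):
    """Apply the proposed rules in place; return the number of changed elements."""
    spanish = is_spanish(root)
    changed = 0
    for node in root.iter("ExternalRef"):
        url = node.get(XREF)
        if spanish:
            if url in (OLD_ENGLISH_URL, OLD_SPANISH_URL):
                node.set(XREF, NEW_SPANISH_URL)
                node.set("SourceTitle", NEW_SPANISH_TITLE)
                changed += 1
        elif url == OLD_ENGLISH_URL:
            node.set(XREF, NEW_ENGLISH_URL)
            changed += 1
    return changed


# Hypothetical, heavily simplified sample documents.
SPANISH_DOC = f"""<Summary xmlns:cdr="{CDR_NS}">
  <SummaryMetaData><SummaryLanguage>Spanish</SummaryLanguage></SummaryMetaData>
  <ExternalRef cdr:xref="{OLD_ENGLISH_URL}">a</ExternalRef>
  <ExternalRef cdr:xref="{OLD_SPANISH_URL}">b</ExternalRef>
  <ExternalRef cdr:xref="https://example.org/unrelated">c</ExternalRef>
</Summary>"""

ENGLISH_DOC = f"""<Summary xmlns:cdr="{CDR_NS}">
  <SummaryMetaData><SummaryLanguage>English</SummaryLanguage></SummaryMetaData>
  <ExternalRef cdr:xref="{OLD_ENGLISH_URL}">a</ExternalRef>
  <ExternalRef cdr:xref="https://example.org/unrelated">c</ExternalRef>
</Summary>"""

for label, text in (("Spanish", SPANISH_DOC), ("English", ENGLISH_DOC)):
    root = ET.fromstring(text)
    print(label, "elements changed:", apply_rules(root))
```

Note that under these rules a Spanish summary linking to the old English URL is redirected to the new Spanish page, and English summaries get no SourceTitle change.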

Comment entered 2024-08-21 10:20:19 by Saucedo, Linda (NIH/NCI) [C]

Hi Bob, thank you for the explanation. I understand. My apologies if the Word doc caused any confusion. I wanted to give you an idea of how many Spanish summaries need to be updated and where the links appear in case it was helpful.

The rule you propose for the Spanish summaries looks good.  

Thank you,

Linda

Comment entered 2024-08-28 13:39:32 by Kline, Bob (NIH/NCI) [C]

Logs from the test run on DEV:

https://cdr-dev.cancer.gov/cgi-bin/cdr/ShowGlobalChangeTestResults.py?dir=2024-08-05_07-58-27

Edit: no, sorry, that was from an earlier global change. By mistake I ran this on my DEV VM. Same database and script, but different log location. Here are the actual logs for this test job:

https://nciws-d2019-v.nci.nih.gov/cgi-bin/cdr/ShowGlobalChangeTestResults.py?dir=2024-08-28_11-26-17

Comment entered 2024-08-29 10:51:13 by Osei-Poku, William (NIH/NCI) [C]

Please change/add "Clinical Trials Information for Patients and Caregivers" to the English summaries' SourceTitle values. Thanks!

Comment entered 2024-09-05 16:34:17 by Osei-Poku, William (NIH/NCI) [C]

As discussed in the CDR/EBMS meeting today, I am clarifying that the text above should be added to the English summaries as their SourceTitle value. If a SourceTitle value is already populated with a different value, it should be changed to the above text.
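One possible reading of this added requirement, sketched in code for discussion only: for an English summary, any ExternalRef pointing at the old or the new English URL ends up with the new URL and the new SourceTitle, whether the title was missing or held a different value. Whether elements already carrying the new URL are in scope is an assumption here, as are the namespace URI and the unqualified SourceTitle attribute name.

```python
import xml.etree.ElementTree as ET

CDR_NS = "cips.nci.nih.gov/cdr"  # assumption: URI for the "cdr" prefix
XREF = f"{{{CDR_NS}}}xref"

OLD_ENGLISH_URL = "https://www.cancer.gov/about-cancer/treatment/clinical-trials"
NEW_ENGLISH_URL = "https://www.cancer.gov/research/participate/clinical-trials"
NEW_ENGLISH_TITLE = "Clinical Trials Information for Patients and Caregivers"


def update_english_ref(node):
    """Set the new URL and SourceTitle; return True if anything changed."""
    url = node.get(XREF)
    if url not in (OLD_ENGLISH_URL, NEW_ENGLISH_URL):
        return False  # unrelated link: leave it alone
    before = (url, node.get("SourceTitle"))
    node.set(XREF, NEW_ENGLISH_URL)
    # Adds the attribute when missing, overwrites it when different.
    node.set("SourceTitle", NEW_ENGLISH_TITLE)
    return before != (NEW_ENGLISH_URL, NEW_ENGLISH_TITLE)


# Hypothetical elements covering the three cases.
missing_title = ET.Element("ExternalRef", {XREF: OLD_ENGLISH_URL})
other_title = ET.Element(
    "ExternalRef", {XREF: NEW_ENGLISH_URL, "SourceTitle": "Some other title"}
)
already_done = ET.Element(
    "ExternalRef", {XREF: NEW_ENGLISH_URL, "SourceTitle": NEW_ENGLISH_TITLE}
)

for node in (missing_title, other_title, already_done):
    print(update_english_ref(node), node.get("SourceTitle"))
```

Returning a changed/unchanged flag keeps a rerun from reporting documents that are already correct.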

Comment entered 2024-09-06 12:23:12 by Kline, Bob (NIH/NCI) [C]

Global change script modified to incorporate new requirements.

https://cdr-dev.cancer.gov/cgi-bin/cdr/ShowGlobalChangeTestResults.py?dir=2024-09-06_11-21-27

Comment entered 2024-09-10 18:10:18 by Osei-Poku, William (NIH/NCI) [C]

The new URL for the Spanish summaries appears to end with a period that is not needed. Please remove it and rerun in test mode on DEV. Thanks!

"https://www.cancer.gov/espanol/investigacion/participe/estudios-clinicos."
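A stray period like this usually comes from copying a URL out of a sentence. A small pre-flight check on the replacement values could catch it before a test run. The function below is a hypothetical sketch, not part of the existing script.

```python
from urllib.parse import urlsplit


def check_replacement_url(url):
    """Return a list of problems found in a replacement URL (empty if none)."""
    problems = []
    if url != url.strip():
        problems.append("leading or trailing whitespace")
    # Sentence punctuation picked up when copying a URL out of running text.
    if url.rstrip().endswith((".", ",", ";", ":")):
        problems.append("trailing punctuation")
    parts = urlsplit(url.strip())
    if parts.scheme != "https" or not parts.netloc:
        problems.append("not an absolute https URL")
    return problems


for candidate in (
    "https://www.cancer.gov/espanol/investigacion/participe/estudios-clinicos.",
    "https://www.cancer.gov/espanol/investigacion/participe/estudios-clinicos",
):
    print(candidate, check_replacement_url(candidate))
```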

Comment entered 2024-09-11 10:22:17 by Kline, Bob (NIH/NCI) [C]

Comment entered 2024-09-12 10:46:47 by Osei-Poku, William (NIH/NCI) [C]

Looks good. Please run in live mode on DEV. Thanks!

Comment entered 2024-09-13 08:40:27 by Kline, Bob (NIH/NCI) [C]

Done.

Comment entered 2024-09-23 11:50:50 by Osei-Poku, William (NIH/NCI) [C]

Looks good on DEV. Please run in test mode on QA. Thanks!

Comment entered 2024-09-23 15:25:51 by Kline, Bob (NIH/NCI) [C]

Comment entered 2024-09-27 09:11:53 by Osei-Poku, William (NIH/NCI) [C]

Verified. Please run in live mode on QA. Thanks!

Comment entered 2024-09-30 08:52:46 by Kline, Bob (NIH/NCI) [C]

Done.

Comment entered 2024-10-21 12:43:38 by Osei-Poku, William (NIH/NCI) [C]

Looks good on QA. Please run in test mode on PROD.

Comment entered 2024-10-21 16:27:23 by Kline, Bob (NIH/NCI) [C]

Comment entered 2024-10-30 10:23:45 by Osei-Poku, William (NIH/NCI) [C]

Verified. Please run in live mode on PROD. Thanks!

Comment entered 2024-10-30 14:26:52 by Kline, Bob (NIH/NCI) [C]

Done.

Comment entered 2024-11-04 11:50:44 by Osei-Poku, William (NIH/NCI) [C]

Verified on PROD. Thanks!

Attachments
File Name Posted User
ClinicalTrialURLLinks_SpanishPatSummaries.docx 2024-08-20 14:05:19 Saucedo, Linda (NIH/NCI) [C]
ClinicalTrialURLLinks_SpanishPatSummaries-1.docx 2024-08-21 08:52:04 Saucedo, Linda (NIH/NCI) [C]
Standard Wording Report_CT Information.docx 2024-08-15 14:34:02 Shields, Victoria (NIH/NCI) [E]
