CDR Tickets

Issue Number 5139
Summary [LOE Adult/Peds] Global replace of LOEs in adult and pediatric summaries
Created 2022-09-15 12:56:53
Issue Type Task
Submitted By Shields, Victoria (NIH/NCI) [E]
Assigned To Kline, Bob (NIH/NCI) [C]
Status Closed
Resolved 2022-09-23 12:51:39
Resolution Fixed
Path /home/bkline/backups/jira/ocecdr/issue.327524
Description

The Levels of Evidence system used by the Adult and Pediatric Treatment Boards has been revised. The current Levels of Evidence included in the summaries, tagged as LOERefs in the CDR, need to be updated with the new Levels. 

An example of the current format used in the text is:

[Level of evidence: 1iiA]

The corresponding term is:

Level of evidence 1iiA

An example of the new format used in the text is:

[Level of evidence A1]

The corresponding term is the same:

Level of evidence A1

The terms have been created in the CDR and published, in both English and Spanish. 

Attached is a Word table that shows the mapping between the current and new LOEs.

 Levels of Evidence_Adult_Peds_Mapping Old to New_09_15_2022.docx
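A minimal sketch of the visible text change described above, assuming the old-to-new mapping has already been loaded into a dictionary. The function name `replace_loe_text` and the single mapping entry are hypothetical and for illustration only; the real pairs are in the attached table. It handles only the bracketed display text, not the LOERef links themselves. Note that the old format has a colon after "Level of evidence" and the new one does not.

```python
import re

# Matches the old bracketed form, e.g. "[Level of evidence: 1iiA]".
OLD_LOE = re.compile(r"\[Level of evidence: ([^\]]+)\]")


def replace_loe_text(text, mapping):
    """Rewrite old-style LOE markers using an old-to-new mapping.

    Markers whose old value is missing from the mapping are left
    unchanged so they can be reviewed by hand.
    """
    def swap(match):
        old = match.group(1).strip()
        new = mapping.get(old)
        if new is None:
            return match.group(0)
        return f"[Level of evidence {new}]"

    return OLD_LOE.sub(swap, text)


# Placeholder pairing for illustration; NOT taken from the real mapping table.
sample_mapping = {"1iiA": "A1"}
sample = (
    "Survival improved [Level of evidence: 1iiA] "
    "in one trial [Level of evidence: 3iiiDiv]."
)
converted = replace_loe_text(sample, sample_mapping)
print(converted)
```

Leaving unmapped values untouched, rather than guessing, keeps any gap in the mapping visible during review.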

Comment entered 2022-09-15 13:25:04 by Kline, Bob (NIH/NCI) [C]

Software can't reliably pull values from a table in a Word document. I have pasted the table into the attached Excel workbook.

What should we do with the row with a question mark for the new LOE?

I assume we're going straight to production (in test mode at first) since that's where the new values are loaded, right?
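A rough sketch of how a row like the one with the question mark could be caught before a run, assuming the workbook has been exported to CSV with columns named `old` and `new`. Those column names, the function name, and the sample rows are all hypothetical; the real workbook layout may differ.

```python
import csv
import io


def find_incomplete_rows(csv_text):
    """Return (row_number, old_value) for rows lacking a usable new LOE."""
    problems = []
    reader = csv.DictReader(io.StringIO(csv_text))
    # Row 1 is the header, so the first data row is row 2.
    for number, row in enumerate(reader, start=2):
        new = (row.get("new") or "").strip()
        if not new or new == "?":
            problems.append((number, (row.get("old") or "").strip()))
    return problems


# Made-up rows for illustration only.
sample = "old,new\nold-1,new-1\nold-2,?\nold-3,\n"
incomplete = find_incomplete_rows(sample)
print(incomplete)
```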

Comment entered 2022-09-15 13:37:15 by Shields, Victoria (NIH/NCI) [E]

EEK! That question mark shouldn't be there. Sorry about that!

Yes, the terms are on PROD.

Comment entered 2022-09-15 16:16:18 by Kline, Bob (NIH/NCI) [C]

New workbook attached. Had to do a bunch of tweaking to the name values to get them to match.

Comment entered 2022-09-21 12:46:11 by Osei-Poku, William (NIH/NCI) [C]

Thanks, Bob! If it is OK, I will ask Stacy and team to review the new spreadsheet and proceed with the global on QA once they are done with the review.

Comment entered 2022-09-21 15:03:38 by Shields, Victoria (NIH/NCI) [E]

Yes, please talk to Stacy and proceed with the global replace on QA. Thanks.

Comment entered 2022-09-22 13:17:09 by Osei-Poku, William (NIH/NCI) [C]

Stacy has finished reviewing the terms in the spreadsheet and confirmed that they are good to go. So we can proceed with running in test mode on QA. Thanks!

Comment entered 2022-09-22 13:18:20 by Osei-Poku, William (NIH/NCI) [C]


What should we do with the row with a question mark for the new LOE?

I assume this is no longer an issue since you're using the CDR ID instead of the term names?

Comment entered 2022-09-22 13:30:01 by Kline, Bob (NIH/NCI) [C]

I got my answer to that question both in the response from Victoria below, as well as in last Thursday's meeting. Reflected in the latest spreadsheet.

Comment entered 2022-09-23 09:32:24 by Kline, Bob (NIH/NCI) [C]

Just so you know, I ran across some summaries which conflicted with my picture of how the Spanish summaries were supposed to work. I thought I had been told that the Spanish summaries do not link directly to the real boards. Instead, their PDQBoard links point to a fake board, and the only way to find the real editorial board for a Spanish summary is to follow the TranslationOf link and pull the editorial board out of the English summary of which it is a translation. However, in assembling the logic to identify the summaries which should be processed for this global change, I came across six summaries which had direct links to the real PDQ Editorial boards AS WELL AS a TranslationOf link to the English summary.

 

-- t: the summary's TranslationOf link; b: its direct PDQBoard link.
SELECT DISTINCT t.doc_id AS "Doc ID"
  FROM query_term t
  JOIN query_term b
    ON b.doc_id = t.doc_id
 WHERE t.path = '/Summary/TranslationOf/@cdr:ref'
   AND b.path = '/Summary/SummaryMetaData/PDQBoard/Board/@cdr:ref'
   AND b.int_val IN (28327, 28557)
   -- The summary being translated must itself link to one of the same boards.
   AND t.int_val IN (
       SELECT DISTINCT doc_id
         FROM query_term
        WHERE path = '/Summary/SummaryMetaData/PDQBoard/Board/@cdr:ref'
          AND int_val IN (28327, 28557)
   )

Doc ID
611985
772163
800324
800326
800370
800372

Not going to impede my progress on the global change, but I thought it possible that someone might want to be aware of these anomalies.
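To make the logic of the query above easier to follow, here is a small self-contained sketch that runs the same self-join against an in-memory SQLite table. The three-column table layout and every document ID in it (100, 200, 300, 400, and the placeholder board 99999) are made up for illustration; only the two paths and the board IDs 28327 and 28557 come from the query itself.

```python
import sqlite3

BOARD_PATH = "/Summary/SummaryMetaData/PDQBoard/Board/@cdr:ref"
TRANSLATION_PATH = "/Summary/TranslationOf/@cdr:ref"

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE query_term (doc_id INTEGER, path TEXT, int_val INTEGER)"
)
rows = [
    (100, BOARD_PATH, 28327),      # English summary linked to a real board
    (200, TRANSLATION_PATH, 100),  # translation of 100 ...
    (200, BOARD_PATH, 28327),      # ... that ALSO links to the real board
    (300, TRANSLATION_PATH, 100),  # translation of 100 ...
    (300, BOARD_PATH, 99999),      # ... linked only to a placeholder board
    (400, BOARD_PATH, 28557),      # English summary with no TranslationOf
]
conn.executemany("INSERT INTO query_term VALUES (?, ?, ?)", rows)

query = """
SELECT DISTINCT t.doc_id
  FROM query_term t
  JOIN query_term b
    ON b.doc_id = t.doc_id
 WHERE t.path = ?
   AND b.path = ?
   AND b.int_val IN (28327, 28557)
   AND t.int_val IN (
       SELECT DISTINCT doc_id
         FROM query_term
        WHERE path = ?
          AND int_val IN (28327, 28557)
   )
 ORDER BY t.doc_id
"""
params = (TRANSLATION_PATH, BOARD_PATH, BOARD_PATH)
anomalies = [row[0] for row in conn.execute(query, params)]
print(anomalies)
```

Only document 200 is reported, because it is the only one carrying both a TranslationOf link and a direct link to one of the two boards.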

Comment entered 2022-09-23 11:02:30 by Osei-Poku, William (NIH/NCI) [C]

Thanks, Bob! They have now been fixed on PROD. 772163 is an English summary, so it should have a link to a real board. However, it is a Temp Doc that has been abandoned and can be deleted from the CDR.

Comment entered 2022-09-23 12:54:55 by Kline, Bob (NIH/NCI) [C]

JIRA appears to have discarded my previous comment. Test mode on QA has completed.

https://cdr-qa.cancer.gov/cgi-bin/cdr/ShowGlobalChangeTestResults.py?dir=2022-09-23_12-08-56

One thing you can do to make it a little easier to find the changes is to search for the caret character (^) which is used to mark the differences.

Comment entered 2022-09-27 12:21:30 by Osei-Poku, William (NIH/NCI) [C]

Please run the global in live mode on QA. Thanks!

Comment entered 2022-09-27 14:12:17 by Kline, Bob (NIH/NCI) [C]

Done.

2022-09-27 14:04:17.020 [INFO] Run completed.
   Docs examined    = 334
   Docs changed     = 334
   Versions changed = 726
   Could not lock   = 0
   Errors           = 0
   Time             = 0:54:34.082336
Specific versions saved:
  new cwd = 69
  new pub = 242
  new ver = 150
  old cwd = 130

Comment entered 2022-09-28 09:51:55 by Osei-Poku, William (NIH/NCI) [C]

The text within the LOERef elements in the Spanish summaries should read "Nivel de evidencia ..." as in the glossary terms (for example, CDR0000810025). They are currently displaying the English text "Level of evidence ..." (for example, CDR0000256668).

Comment entered 2022-09-28 10:20:14 by Kline, Bob (NIH/NCI) [C]

  1. I didn't see anything in the ticket with that requirement.

  2. The live mode did exactly what the test mode did.

Comment entered 2022-09-28 11:04:18 by Shields, Victoria (NIH/NCI) [E]

I think I should have included a document that mapped the Spanish terms (old to new) like I did for the English. If I provide that, would you be able to update the Spanish summaries? And should I open a new ticket for this part of the task? Sorry I missed that when I created this ticket.

Comment entered 2022-09-28 11:41:38 by Kline, Bob (NIH/NCI) [C]

What you would need to do, I think, is add a seventh column to the latest spreadsheet (ocecdr-5139-names-and-ids.xlsx) with the Spanish names, unless they can all be reliably derived from the English names by mechanically replacing "Level of evidence " with "Nivel de evidencia " in the Spanish summaries.

We can't just perform the live run a second time, because the document IDs for the terms which the script is looking for aren't there any more. That's why it's unfortunate that the requirement didn't make it into the original ticket and wasn't caught in the review of the test-mode run. William would have to put in another ticket for Volker to refresh QA again in order to do another live-mode run. And if you go that route I would strongly recommend that another test run be performed and carefully reviewed before running the job in live mode again.

The alternative to refreshing QA again is to create another global change job to replace the text content of the elements in the Spanish summaries. For that we'd need the map of new IDs to Spanish strings. If we go this route, we'd want to avoid fixing the original script for this ticket. For what we test on QA to be of any use in verifying that what we do on PROD will be correct, we would have to run the unaltered first script on production, creating the wrong term names for the Spanish summaries, and then run the second global change job to fix that problem.

Make sense?

Comment entered 2022-09-28 12:10:01 by Osei-Poku, William (NIH/NCI) [C]

I will ask Linda to update the spreadsheet if the text cannot be replaced with "Nivel de evidencia " in all cases.  I will also create another ticket for QA to be refreshed so we go through another test run and careful review again before a live run on QA.

Comment entered 2022-09-28 13:37:31 by Osei-Poku, William (NIH/NCI) [C]

Linda confirmed that the text will read "Nivel de evidencia" in all cases. Does that mean there is no need to update the spreadsheet? Will all the different level values display correctly even without the updated spreadsheet?

Comment entered 2022-09-28 14:15:53 by Kline, Bob (NIH/NCI) [C]

Yes, assuming all the software needs to do is exactly what it did during the previous two runs, except replace "Level of evidence " with "Nivel de evidencia " in the text value for the Spanish summaries.
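The mechanical substitution agreed on here can be sketched as follows. This is only an illustration: the function name `loe_display_name` and the language labels "English" and "Spanish" are assumptions, not names taken from the actual global change script.

```python
ENGLISH_PREFIX = "Level of evidence "
SPANISH_PREFIX = "Nivel de evidencia "


def loe_display_name(english_name, language):
    """Return the LOE text to display for a summary in the given language.

    English summaries keep the English term name. Spanish summaries get
    the same level value behind the Spanish prefix.
    """
    if language != "Spanish":
        return english_name
    if not english_name.startswith(ENGLISH_PREFIX):
        # Refuse to guess when the name does not follow the expected pattern.
        raise ValueError(f"unexpected term name: {english_name!r}")
    return SPANISH_PREFIX + english_name[len(ENGLISH_PREFIX):]


print(loe_display_name("Level of evidence A1", "Spanish"))
print(loe_display_name("Level of evidence A1", "English"))
```

Raising an error for a name that lacks the expected prefix means an unexpected term shows up in the test-mode run instead of being silently passed through.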

Comment entered 2022-09-29 17:02:58 by Kline, Bob (NIH/NCI) [C]

Another test mode done on QA:

2022-09-29 16:46:58.168 [INFO] Run completed.
   Docs examined    = 334
   Docs changed     = 0
   Versions changed = 910
   Could not lock   = 0
   Errors           = 0
   Time             = 0:37:05.709539

I see that they used to call it "Grado de comprobación."

https://cdr-qa.cancer.gov/cgi-bin/cdr/ShowGlobalChangeTestResults.py?dir=2022-09-29_16-09-52
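For comparing run summaries like the one above across test and live runs, a small parser can turn the counters into a dictionary. This is a sketch that assumes the summary always uses the "label = integer" layout shown in this ticket; the function name is hypothetical. The "Time" line is skipped because its value is not a plain integer.

```python
import re

SUMMARY_LINE = re.compile(r"\s*(.+?)\s*=\s*(\d+)\s*$")


def parse_run_summary(log_text):
    """Collect the integer counters from a global change run summary."""
    counts = {}
    for line in log_text.splitlines():
        match = SUMMARY_LINE.match(line)
        if match:
            counts[match.group(1)] = int(match.group(2))
    return counts


test_run = """\
2022-09-29 16:46:58.168 [INFO] Run completed.
   Docs examined    = 334
   Docs changed     = 0
   Versions changed = 910
   Could not lock   = 0
   Errors           = 0
   Time             = 0:37:05.709539
"""
summary = parse_run_summary(test_run)
print(summary)
```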

Comment entered 2022-10-04 11:16:06 by Osei-Poku, William (NIH/NCI) [C]

Review of test results is complete. Please run in live mode on QA. Thanks!

Comment entered 2022-10-04 16:03:11 by Kline, Bob (NIH/NCI) [C]

Done.

2022-10-04 16:00:55.104 [INFO] Run completed.
   Docs examined    = 334
   Docs changed     = 334
   Versions changed = 726
   Could not lock   = 0
   Errors           = 0
   Time             = 0:56:17.399866
Specific versions saved:
  new cwd = 69
  new pub = 242
  new ver = 150
  old cwd = 134

As a side note, I noticed that we have a LOT of blocked summary documents.

Comment entered 2022-10-07 11:37:30 by Osei-Poku, William (NIH/NCI) [C]

Please run the global in test mode on PROD. Thanks!

Comment entered 2022-10-07 14:52:03 by Kline, Bob (NIH/NCI) [C]

Done.

2022-10-07 14:47:04.547 [INFO] Run completed.
   Docs examined    = 334
   Docs changed     = 0
   Versions changed = 909
   Could not lock   = 0
   Errors           = 0
   Time             = 0:53:48.420192

https://cdr-qa.cancer.gov/cgi-bin/cdr/ShowGlobalChangeTestResults.py?dir=2022-10-07_13-53-16

Comment entered 2022-10-11 16:57:50 by Osei-Poku, William (NIH/NCI) [C]

Please run in live mode on PROD.

Comment entered 2022-10-12 12:14:14 by Kline, Bob (NIH/NCI) [C]

Done.

2022-10-12 12:12:10.168 [INFO] Run completed.
   Docs examined    = 318
   Docs changed     = 317
   Versions changed = 682
   Could not lock   = 0
   Errors           = 0
   Time             = 1:25:18.408458
Specific versions saved:
  new cwd = 69
  new pub = 232
  new ver = 133
  old cwd = 133

Comment entered 2022-10-20 11:30:22 by Osei-Poku, William (NIH/NCI) [C]

Hi  

Do you know why these two summaries appear to have not been updated on PROD?

779396 and 779398

Comment entered 2022-10-20 12:58:28 by Kline, Bob (NIH/NCI) [C]

That was a bug in the script's query. Unfortunately, we didn't run the job in test mode on PROD, so we didn't catch it.

Comment entered 2022-10-20 13:16:39 by Osei-Poku, William (NIH/NCI) [C]

We ran it in test mode on PROD. Please see comments below. Would you be able to run an ad hoc query to identify only the ones that were not updated? I ran a simple query that identified these docs below, but I am not sure if that is the complete list.

779396
779398
778295
777844
780682
781009
781609
780118
801593

Comment entered 2022-10-20 14:48:18 by Kline, Bob (NIH/NCI) [C]

Unfortunately, we didn't run the job in test mode on PROD, ...

I think JIRA's comment-suppression bug was just doing its thing. 😛

 

25 summaries still need to be processed
777844
778295
779396
779398
780118
780682
781009
781609
805701
805704
805868
806272
806827
807006
808521
808522
809267
810147
810237
810726
810727
810728
810743
810760
810761
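As a quick cross-check of the two lists in this thread, the nine IDs from the simple query can be compared with the 25 IDs above using set arithmetic. The variable names are illustrative; the IDs are copied from the comments. The ticket does not say why the lists differ, so the sketch only reports the difference.

```python
# IDs reported by the simple ad hoc query earlier in the thread.
simple_query_ids = {
    779396, 779398, 778295, 777844, 780682, 781009, 781609, 780118, 801593,
}

# The 25 summaries listed as still needing to be processed.
followup_ids = {
    777844, 778295, 779396, 779398, 780118, 780682, 781009, 781609,
    805701, 805704, 805868, 806272, 806827, 807006, 808521, 808522,
    809267, 810147, 810237, 810726, 810727, 810728, 810743, 810760,
    810761,
}

only_in_simple = sorted(simple_query_ids - followup_ids)
only_in_followup = sorted(followup_ids - simple_query_ids)

print("In the simple query only:", only_in_simple)
print("In the followup list only:", len(only_in_followup), "documents")
```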

Comment entered 2022-10-20 15:27:59 by Osei-Poku, William (NIH/NCI) [C]

Thanks, Bob! Let's proceed with running the global for these documents. I should have been specific: please run it in test mode for these documents.

Comment entered 2022-10-20 15:31:15 by Osei-Poku, William (NIH/NCI) [C]

Before you run the global for the identified documents, could you also check why document CDR62941 is not on the list? It is not a Module Only document and I expected to see it on the list.

Comment entered 2022-10-20 16:37:43 by Kline, Bob (NIH/NCI) [C]

Don't think ModuleOnly would make a difference for this global. There are no LOERef rows in the query_term table for that document on PROD. I can think of no reason why that would be, as I can see that there are four LOERef elements in the current working document. Will keep digging.

Comment entered 2022-10-27 15:17:28 by Kline, Bob (NIH/NCI) [C]

We talked in this afternoon's meeting about proceeding with the followup global on PROD, but we still haven't unraveled the mystery of why 62941 doesn't have any rows in the query_term table for LOERef links, even though the document has those elements. Do you want me to proceed anyway?

Comment entered 2022-10-27 17:29:53 by Kline, Bob (NIH/NCI) [C]

OK, I finally tracked down why this document doesn't have any rows in the query_term table for LOERef elements. All four of the LOERef elements are deeply nested inside Insertion elements whose RevisionLevel causes the revisions to be backed out for the resolved version of the document on which the query_term indexing is performed. So the query_term table is as it should be. I will proceed with the followup test job.
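The effect described here can be illustrated with a simplified sketch. It assumes that an Insertion element whose RevisionLevel is not in an accepted set is dropped, along with everything nested inside it, before indexing. The sample XML, the level values "proposed" and "publish", and the function name are assumptions for illustration; the actual CDR revision filtering is more involved.

```python
import xml.etree.ElementTree as ET

SAMPLE = """
<Summary>
  <SummarySection>
    <Insertion RevisionLevel="proposed">
      <Para>New text <LOERef>Level of evidence A1</LOERef></Para>
    </Insertion>
  </SummarySection>
</Summary>
"""


def visible_loerefs(elem, accepted_levels):
    """Find LOERef elements that survive once revisions are resolved.

    Any Insertion whose RevisionLevel is not accepted is skipped along
    with its whole subtree, mimicking a backed-out revision.
    """
    found = []
    for child in elem:
        is_insertion = child.tag == "Insertion"
        if is_insertion and child.get("RevisionLevel") not in accepted_levels:
            continue
        if child.tag == "LOERef":
            found.append(child)
        found.extend(visible_loerefs(child, accepted_levels))
    return found


root = ET.fromstring(SAMPLE.strip())
in_working_doc = len(list(root.iter("LOERef")))
in_resolved_doc = len(visible_loerefs(root, {"publish"}))
print(in_working_doc, in_resolved_doc)
```

The element is present in the working document but absent from the resolved view, which is why an index built from the resolved view has no row for it.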

Comment entered 2022-10-27 17:40:09 by Osei-Poku, William (NIH/NCI) [C]

Yes, proceed to run it in test mode on PROD. Would you be able to include this one too: CDR62941?

Comment entered 2022-10-28 09:16:34 by Kline, Bob (NIH/NCI) [C]

CDR62941 manually included.

2022-10-28 09:14:00.857 [INFO] Run completed.
   Docs examined    = 26
   Docs changed     = 0
   Versions changed = 58
   Could not lock   = 0
   Errors           = 0
   Time             = 0:03:33.543087

Comment entered 2022-10-31 09:32:55 by Osei-Poku, William (NIH/NCI) [C]

Please run in live mode on PROD. Thanks!

Comment entered 2022-10-31 09:56:44 by Kline, Bob (NIH/NCI) [C]

Done.

2022-10-31 09:54:27.212 [INFO] Run completed.
   Docs examined    = 26
   Docs changed     = 23
   Versions changed = 52
   Could not lock   = 0
   Errors           = 0
   Time             = 0:05:58.964680
Specific versions saved:
  new cwd = 6
  new pub = 13
  new ver = 16
  old cwd = 8

Comment entered 2022-11-02 10:36:55 by Osei-Poku, William (NIH/NCI) [C]

Closing this ticket as all docs appear to be OK. Thank you!!

Attachments
File Name Posted User
Levels of Evidence_Adult_Peds_Mapping Old to New_09_15_2022.docx 2022-09-15 12:56:24 Shields, Victoria (NIH/NCI) [E]
ocecdr-5139.xlsx 2022-09-15 13:23:06 Kline, Bob (NIH/NCI) [C]
ocecdr-5139-names-and-ids.xlsx 2022-09-15 16:15:23 Kline, Bob (NIH/NCI) [C]
