CDR Tickets

Issue Number 4649
Summary [Summaries] Board Roster Report Broken on PROD
Created 2019-08-12 10:14:56
Issue Type Bug
Submitted By Juthe, Robin (NIH/NCI) [E]
Assigned To Kline, Bob (NIH/NCI) [C]
Status Closed
Resolved 2019-10-10 14:02:06
Resolution Won't Fix
Path /home/bkline/backups/jira/ocecdr/issue.248254
Description

I'm unable to generate the Board Roster Report (all flavors - full, summary, combined). I keep getting a Python error. Here's an example of the error:

A problem occurred in a Python script.

D:\cdr\Log\tmpoeacub.html contains the description of this error.

Comment entered 2019-08-12 10:41:02 by Kline, Bob (NIH/NCI) [C]

The problem is the OrganizationParent link from CDR27564 (the Organization document for NCI - Office of Cancer Content) to CDR27606 (NCI - Office of Communications and Public Liaison), which was removed as part of the streamlining project. I'm looking over the tickets to refresh my memory for what decisions/actions were taken about parent organizations. I guess this was one of the things we didn't test on the lower tiers.

Comment entered 2019-08-12 11:04:52 by Kline, Bob (NIH/NCI) [C]

I have found the trail. We decided in OCECDR-4478 to drop Organization documents other than PDQ boards and orgs linked directly by Person docs we're keeping, and in OCECDR-4538 to drop the links to the Organization parent documents which have been removed. I'm not sure why the global hasn't been run on PROD yet for that second step, but I think the report will work after it has been run. You could edit CDR27564 and remove the OrganizationParent element manually to get the report to run, I think.

Comment entered 2019-08-12 11:38:50 by Juthe, Robin (NIH/NCI) [E]

Thanks, Bob. Unfortunately, I'm still getting an error. (I removed the OrganizationParent element from CDR27564 and created a valid, publishable version of the org record.) Sorry. Here's an example of the Python error this time:

A problem occurred in a Python script.

D:\cdr\Log\tmpgrl7dh.html contains the description of this error.

Comment entered 2019-08-12 11:45:22 by Kline, Bob (NIH/NCI) [C]

Now it's choking on CDR32676 (American College of Surgeons Oncology Group) which links to the removed CDR35641 (American College of Surgeons). You could do these one by one, but can you think of a reason not to have Volker proceed with the global change on PROD?

Comment entered 2019-08-12 12:00:13 by Juthe, Robin (NIH/NCI) [E]

I'm concerned we're going to lose (or perhaps have lost) some level of specificity with regard to our Board members' addresses. Sometimes I think we have relied on a parent org to provide the address if it is the same as the main organization. I just reviewed the links to obsolete docs file in OCECDR-4478. Would it be possible to generate a file that shows which links to obsolete docs are associated with an active Board member? Then we'd have an idea how many Board members/org docs are affected. Does that make sense?

Comment entered 2019-08-12 14:46:17 by Kline, Bob (NIH/NCI) [C]

I have been wrestling with this report request for a good while (I don't have a good handle on what "associated with an active Board member" means, and following every link from every active board member document recursively is looking more and more like a rabbit hole). Can we defer this revisiting of the decision to drop the parent organizations until Kepler?

Comment entered 2019-08-12 15:43:36 by Juthe, Robin (NIH/NCI) [E]

I was trying to get a sense of how many documents we'd need to manually update. I'm okay with deferring the global discussion but we'll need to get the roster reports working again before then - we use them all the time. I'll manually edit the document you mentioned above and see if that happens to do it, but I'm afraid we might just run into more documents that need updating.

Comment entered 2019-08-12 15:50:08 by Kline, Bob (NIH/NCI) [C]

What's the downside of having Volker run the script to remove the broken links? I expect that will fix the broken report.

Comment entered 2019-08-12 15:59:15 by Juthe, Robin (NIH/NCI) [E]

I didn't realize the second part of the global is just removing the links - the docs have all been removed. I guess there isn't any downside to removing the broken links now since that's the same thing I'm doing manually. Thanks.

Comment entered 2019-08-12 16:22:21 by Kline, Bob (NIH/NCI) [C]

Volker reminded me that the change to zip code validation has to be applied before the global can be run. Otherwise some of the modified documents will be marked as invalid when they shouldn't be.

Comment entered 2019-08-12 17:41:49 by Kline, Bob (NIH/NCI) [C]

OK, I've been tweaking the code for this report repeatedly until I've gotten it down to one more broken link, from CDR28945 (NCI - Division of Cancer Treatment and Diagnosis) to CDR27785 (NCI - Division of Cancer Treatment and Diagnosis). However, this assumes more knowledge than I have about how all the filters, particularly the denormalization filters, actually work. I have skipped over a lot of links, like links for membership in an ad-hoc group or CCOP, or a preferred protocol organization, but without a meticulous examination of every branch in the filtering logic, it's impossible to know which links the filters are trying to follow, and for which of those they throw an error instead of just moving on.

CDR369939 => CDR9130 using /PDQBoardMemberInfo/BoardMemberName/@cdr:ref
CDR9130 => CDR35518 using /Person/PersonLocations/OtherPracticeLocation/OrganizationLocation/@cdr:ref
CDR35518 => CDR28945 using /Organization/OrganizationParent/ParentOrganization/@cdr:ref
CDR28945 => CDR27785 using /Organization/OrganizationParent/ParentOrganization/@cdr:ref
LINK TO CDR27785 IS BROKEN

You might want to run on DEV (where I believe the global change has been run) the same report which is failing on PROD, just to verify that the global change actually does solve the failures you ran into this morning. You could also have Volker re-run the global on STAGE (which was refreshed from PROD the other day) and test the report there.

Comment entered 2019-08-12 18:43:26 by Kline, Bob (NIH/NCI) [C]

I ran the report again with another slight tweak, and found four broken links:

======================================================================
CDR369780 => CDR22087 using /PDQBoardMemberInfo/BoardMemberName/@cdr:ref
CDR22087 => CDR27766 using /Person/PersonLocations/OtherPracticeLocation/OrganizationLocation/@cdr:ref
CDR27766 => CDR37398 using /Organization/OrganizationParent/ParentOrganization/@cdr:ref
LINK TO CDR37398 IS BROKEN
======================================================================
======================================================================
CDR369780 => CDR22087 using /PDQBoardMemberInfo/BoardMemberName/@cdr:ref
CDR22087 => CDR35250 using /Person/PersonLocations/OtherPracticeLocation/ComplexAffiliation/Organization/@cdr:ref
CDR35250 => CDR35694 using /Organization/OrganizationParent/ParentOrganization/@cdr:ref
LINK TO CDR35694 IS BROKEN
======================================================================
======================================================================
CDR369844 => CDR25318 using /PDQBoardMemberInfo/BoardMemberName/@cdr:ref
CDR25318 => CDR34983 using /Person/PersonLocations/OtherPracticeLocation/OrganizationLocation/@cdr:ref
CDR34983 => CDR29106 using /Organization/OrganizationParent/ParentOrganization/@cdr:ref
LINK TO CDR29106 IS BROKEN
======================================================================
======================================================================
CDR369939 => CDR9130 using /PDQBoardMemberInfo/BoardMemberName/@cdr:ref
CDR9130 => CDR35518 using /Person/PersonLocations/OtherPracticeLocation/OrganizationLocation/@cdr:ref
CDR35518 => CDR28945 using /Organization/OrganizationParent/ParentOrganization/@cdr:ref
CDR28945 => CDR27785 using /Organization/OrganizationParent/ParentOrganization/@cdr:ref
LINK TO CDR27785 IS BROKEN
======================================================================

Comment entered 2019-08-13 05:01:00 by Kline, Bob (NIH/NCI) [C]

I have confirmed that the filters are trying to resolve more than just the person (or member) document TO organization doc TO parent organization doc chains, even though it's likely that most of the links are not needed for this report. I tried running the report on STAGE, which was recently cloned from PROD. The first error was the same problem Robin hit when she first tried running the report on PROD (member doc TO board doc TO parent org doc). But after I removed that link the next error was board member doc TO other practice location org doc TO member of ad-hoc-group doc. When I tried deleting that link the saved document was invalid because of the zip code. The basic problem is that the report runs the standard denormalization filters, and those basically follow every link which might conceivably be needed (including possibly some links for protocol-related information which we know we don't need).

The options that I can think of include:

  1. Hold off on running the report until Joule is promoted to production

  2. Refresh QA, re-apply Joule, and run the report on QA

  3. Apply Joule to STAGE and run the report on STAGE

  4. Ask CBIIT to apply just a patch for OCECDR-4528 (zip code validation) to PROD and run the global for OCECDR-4538

  5. Have Volker determine whether it is feasible to modify the filters to skip over broken links for things we don't need

Comment entered 2019-08-13 05:06:14 by Kline, Bob (NIH/NCI) [C]

I have attached a report which looks for all the broken links which can be reached recursively from the PDQBoardMemberInfo or Person documents for the active board members, with the complete chain for the first time the broken link was found. There are 220 broken links in the report. I did this on STAGE before deleting any of the links in the attempt to run the report on STAGE.
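
The traversal described above can be sketched roughly as follows. This is only an illustration, not the script that produced the attached report: the `links` mapping, the `existing` set, and both function names are hypothetical stand-ins for whatever the real code reads from the CDR database. The sample data reuses one chain quoted earlier in this ticket.

```python
from collections import deque


def find_broken_links(start_ids, links, existing):
    """Map each broken target ID to the first chain of links that reaches it.

    links    -- hypothetical dict: doc ID -> list of (xpath, target ID) pairs
    existing -- hypothetical set of doc IDs which are still in the repository
    """
    broken = {}
    seen = set(start_ids)  # guards against cycles and repeated visits
    queue = deque((doc_id, []) for doc_id in start_ids)
    while queue:
        doc_id, chain = queue.popleft()
        for path, target in links.get(doc_id, []):
            step = chain + [(doc_id, target, path)]
            if target not in existing:
                # Keep only the chain for the first time the link was found.
                broken.setdefault(target, step)
            elif target not in seen:
                seen.add(target)
                queue.append((target, step))
    return broken


def format_chain(chain):
    """Render a chain in the same layout as the output pasted in this ticket."""
    lines = [f"CDR{src} => CDR{dst} using {path}" for src, dst, path in chain]
    lines.append(f"LINK TO CDR{chain[-1][1]} IS BROKEN")
    return "\n".join(lines)


# Sample data taken from one chain shown earlier in this ticket.
PARENT = "/Organization/OrganizationParent/ParentOrganization/@cdr:ref"
SAMPLE_LINKS = {
    369939: [("/PDQBoardMemberInfo/BoardMemberName/@cdr:ref", 9130)],
    9130: [("/Person/PersonLocations/OtherPracticeLocation"
            "/OrganizationLocation/@cdr:ref", 35518)],
    35518: [(PARENT, 28945)],
    28945: [(PARENT, 27785)],
}
SAMPLE_EXISTING = {369939, 9130, 35518, 28945}

if __name__ == "__main__":
    found = find_broken_links([369939], SAMPLE_LINKS, SAMPLE_EXISTING)
    for chain in found.values():
        print(format_chain(chain))
```

A breadth-first walk with a `seen` set keeps the traversal from looping when documents link back to each other, and recording only the first chain per broken target matches the "first time the broken link was found" behavior described above.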

Comment entered 2019-08-13 09:59:18 by Juthe, Robin (NIH/NCI) [E]

I ran the roster reports on DEV and they worked fine. However, I compared the results with a relatively recent version of the report from PROD (from May 2019) and noticed some discrepancies that seem like they could be due to lost parent orgs. The differences are almost entirely among NIH staff addresses, where we've listed a division, NCI, NIH, etc. I'm attaching a document highlighting the differences.

Comment entered 2019-08-13 10:01:50 by Juthe, Robin (NIH/NCI) [E]

I think these could all be corrected manually by adding the additional org levels to the address in the person record - so not a deal breaker - but it may be worth running a diff report or something on PROD when we get to that point rather than relying on my eyes and this imperfect DEV/PROD comparison. 🙂
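
One possible shape for such a diff report, assuming the before and after roster reports can be saved as plain text: the sketch below uses Python's standard `difflib` module. The function name, the labels, and the sample roster entry are all hypothetical and are not taken from the actual reports.

```python
import difflib


def diff_reports(old_text, new_text, old_label="before", new_label="after"):
    """Return unified-diff lines between two saved roster reports."""
    return list(difflib.unified_diff(
        old_text.splitlines(),
        new_text.splitlines(),
        fromfile=old_label,
        tofile=new_label,
        lineterm="",
    ))


# Hypothetical roster entry in which an organization level has gone missing.
OLD_REPORT = "Jane Doe\nExample Division\nNational Cancer Institute\nBethesda, MD"
NEW_REPORT = "Jane Doe\nNational Cancer Institute\nBethesda, MD"

if __name__ == "__main__":
    for line in diff_reports(OLD_REPORT, NEW_REPORT):
        print(line)
```

Lines prefixed with `-` appear only in the earlier report, so a dropped parent organization level would show up as a removed address line.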

Comment entered 2019-08-13 10:48:06 by Juthe, Robin (NIH/NCI) [E]

Thanks for the report. I was hoping to spot check a handful of these documents but I'm not able to get XMetaL on STAGE to open. It spins and times out.

Comment entered 2019-10-10 14:01:50 by Juthe, Robin (NIH/NCI) [E]

We are manually correcting these addresses in the person records.

Attachments
File Name Posted User
broken-links-stage.txt 2019-08-13 05:01:55 Kline, Bob (NIH/NCI) [C]
Comparison of Board Roster Reports.docx 2019-08-13 09:59:24 Juthe, Robin (NIH/NCI) [E]
