Issue Number | 4649 |
---|---|
Summary | [Summaries] Board Roster Report Broken on PROD |
Created | 2019-08-12 10:14:56 |
Issue Type | Bug |
Submitted By | Juthe, Robin (NIH/NCI) [E] |
Assigned To | Kline, Bob (NIH/NCI) [C] |
Status | Closed |
Resolved | 2019-10-10 14:02:06 |
Resolution | Won't Fix |
Path | /home/bkline/backups/jira/ocecdr/issue.248254 |
I'm unable to generate the Board Roster Report (all flavors - full, summary, combined). I keep getting a Python error. Here's an example of the error:
A problem occurred in a Python script.
D:\cdr\Log\tmpoeacub.html contains the description of this error.
The problem is the OrganizationParent link from CDR27564 (the Organization document for NCI - Office of Cancer Content) to CDR27606 (NCI - Office of Communications and Public Liaison), which was removed as part of the streamlining project.
I'm looking over the tickets to refresh my memory for what
decisions/actions were taken about parent organizations. I guess this
was one of the things we didn't test on the lower tiers.
I have found the trail. We decided in OCECDR-4478 to drop Organization documents other than PDQ boards and orgs linked directly by Person docs we're keeping, and in OCECDR-4538 to drop the links to the Organization parent documents which have been removed. I'm not sure why the global hasn't been run on PROD yet for that second step, but I think the report will work after it has been run. You could edit CDR27564 and remove the OrganizationParent element manually to get the report to run, I think.
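For readers unfamiliar with what removing the OrganizationParent element amounts to, here is a minimal sketch in Python. It is not the actual global change script from OCECDR-4538. The function name, the sample document, the namespace URI bound to the `cdr:` prefix, and the short ID form are all assumptions made for illustration; only the element path `/Organization/OrganizationParent/ParentOrganization/@cdr:ref` comes from this ticket.

```python
import xml.etree.ElementTree as ET

# Assumption: namespace URI bound to the cdr: prefix in CDR documents.
CDR_NS = "cips.nci.nih.gov/cdr"
REF = f"{{{CDR_NS}}}ref"


def drop_broken_parents(xml_text, removed_ids):
    """Remove OrganizationParent blocks whose parent link targets a removed doc.

    Hypothetical helper: returns the modified XML and the list of dropped targets.
    """
    root = ET.fromstring(xml_text)
    dropped = []
    # Copy the list first so removal does not disturb iteration.
    for parent_block in list(root.findall("OrganizationParent")):
        link = parent_block.find("ParentOrganization")
        target = link.get(REF) if link is not None else None
        if target in removed_ids:
            root.remove(parent_block)
            dropped.append(target)
    return ET.tostring(root, encoding="unicode"), dropped


# Invented sample document; real Organization documents carry far more content.
SAMPLE = """<Organization xmlns:cdr="cips.nci.nih.gov/cdr">
  <OrganizationType>Example</OrganizationType>
  <OrganizationParent>
    <ParentOrganization cdr:ref="CDR27606">Removed parent</ParentOrganization>
  </OrganizationParent>
</Organization>"""

new_xml, dropped = drop_broken_parents(SAMPLE, {"CDR27606"})
print(dropped)
```

A real run would also need to save and validate each modified document, which this sketch leaves out.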
Thanks, Bob. Unfortunately, I'm still getting an error. (I removed the OrganizationParent element from CDR27564 and created a valid, publishable version of the org record.) Sorry. Here's an example of the Python error this time:
A problem occurred in a Python script.
D:\cdr\Log\tmpgrl7dh.html contains the description of this error.
Now it's choking on CDR32676 (American College of Surgeons Oncology Group) which links to the removed CDR35641 (American College of Surgeons). You could do these one by one, but can you think of a reason not to have Volker proceed with the global change on PROD?
I'm concerned we're going to lose (or perhaps have lost) some level of specificity with regard to our Board members' addresses. Sometimes I think we have relied on a parent org to provide the address if it is the same as the main organization. I just reviewed the links to obsolete docs file in OCECDR-4478. Would it be possible to generate a file that shows which links to obsolete docs are associated with an active Board member? Then we'd have an idea how many Board members/org docs are affected. Does that make sense?
I have been wrestling with this report request for a good while (I don't have a good handle on what "associated with an active Board member" means, and following every link from every active board member document recursively is looking more and more like a rabbit hole). Can we defer this revisiting of the decision to drop the parent organizations until Kepler?
I was trying to get a sense of how many documents we'd need to manually update. I'm okay with deferring the global discussion but we'll need to get the roster reports working again before then - we use them all the time. I'll manually edit the document you mentioned above and see if that happens to do it, but I'm afraid we might just run into more documents that need updating.
What's the downside of having Volker run the script to remove the broken links? I expect that will fix the broken report.
I didn't realize the second part of the global is just removing the links - the docs have all been removed. I guess there isn't any downside to removing the broken links now since that's the same thing I'm doing manually. Thanks.
Volker reminded me that the change to zip code validation has to be applied before the global can be run. Otherwise some of the modified documents will be marked as invalid when they shouldn't be.
OK, I've been tweaking the code for this report repeatedly until I've gotten it down to one more broken link, from CDR28945 (NCI - Division of Cancer Treatment and Diagnosis) to CDR27785 (NCI - Division of Cancer Treatment and Diagnosis). However, this assumes more knowledge about how all the filters – particularly the denormalization filters – actually work than I actually have. I have skipped over a lot of links, like links for membership in an ad-hoc group or CCOP, or a preferred protocol organization, but without a meticulous examination of every branch in the filtering logic, it's impossible to know which links the filters are trying to follow, and for which of those it throws an error instead of just moving on.
=> CDR9130 using /PDQBoardMemberInfo/BoardMemberName/@cdr:ref
CDR369939 => CDR35518 using /Person/PersonLocations/OtherPracticeLocation/OrganizationLocation/@cdr:ref
CDR9130 => CDR28945 using /Organization/OrganizationParent/ParentOrganization/@cdr:ref
CDR35518 => CDR27785 using /Organization/OrganizationParent/ParentOrganization/@cdr:ref
CDR28945 LINK TO CDR27785 IS BROKEN
You might want to run on DEV (where I believe the global change has been run) the same report which is failing on PROD, just to verify that the global change actually does solve the failures you ran into this morning. You could also have Volker re-run the global on STAGE (which was refreshed from PROD the other day) and test the report there.
I ran the report again with another slight tweak, and found four broken links:
======================================================================
=> CDR22087 using /PDQBoardMemberInfo/BoardMemberName/@cdr:ref
CDR369780 => CDR27766 using /Person/PersonLocations/OtherPracticeLocation/OrganizationLocation/@cdr:ref
CDR22087 => CDR37398 using /Organization/OrganizationParent/ParentOrganization/@cdr:ref
CDR27766 LINK TO CDR37398 IS BROKEN
======================================================================
=> CDR22087 using /PDQBoardMemberInfo/BoardMemberName/@cdr:ref
CDR369780 => CDR35250 using /Person/PersonLocations/OtherPracticeLocation/ComplexAffiliation/Organization/@cdr:ref
CDR22087 => CDR35694 using /Organization/OrganizationParent/ParentOrganization/@cdr:ref
CDR35250 LINK TO CDR35694 IS BROKEN
======================================================================
=> CDR25318 using /PDQBoardMemberInfo/BoardMemberName/@cdr:ref
CDR369844 => CDR34983 using /Person/PersonLocations/OtherPracticeLocation/OrganizationLocation/@cdr:ref
CDR25318 => CDR29106 using /Organization/OrganizationParent/ParentOrganization/@cdr:ref
CDR34983 LINK TO CDR29106 IS BROKEN
======================================================================
=> CDR9130 using /PDQBoardMemberInfo/BoardMemberName/@cdr:ref
CDR369939 => CDR35518 using /Person/PersonLocations/OtherPracticeLocation/OrganizationLocation/@cdr:ref
CDR9130 => CDR28945 using /Organization/OrganizationParent/ParentOrganization/@cdr:ref
CDR35518 => CDR27785 using /Organization/OrganizationParent/ParentOrganization/@cdr:ref
CDR28945 LINK TO CDR27785 IS BROKEN
======================================================================
I have confirmed that the filters are trying to resolve more than just the person (or member) document TO organization doc TO parent organization doc chains, even though it's likely that most of the links are not needed for this report. I tried running the report on STAGE, which was recently cloned from PROD. The first error was the same problem Robin hit when she first tried running the report on PROD (member doc TO board doc TO parent org doc). But after I removed that link the next error was board member doc TO other practice location org doc TO member of ad-hoc-group doc. When I tried deleting that link the saved document was invalid because of the zip code. The basic problem is that the report runs the standard denormalization filters, and those basically follow every link which might conceivably be needed (including possibly some links for protocol-related information which we know we don't need).
The options that I can think of include:
- Hold off on running the report until Joule is promoted to production
- Refresh QA, re-apply Joule, and run the report on QA
- Apply Joule to STAGE and run the report on STAGE
- Ask CBIIT to apply just a patch for OCECDR-4528 (zip code validation) to PROD and run the global for OCECDR-4538
- Have Volker determine whether it is feasible to modify the filters to skip over broken links for things we don't need
I have attached a report which looks for all the broken links which can be reached recursively from the PDQBoardMemberInfo or Person documents for the active board members, with the complete chain for the first time the broken link was found. There are 220 broken links in the report. I did this on STAGE before deleting any of the links in the attempt to run the report on STAGE.
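The kind of traversal described above can be sketched as follows. This is not the script that produced broken-links-stage.txt; it is a minimal illustration that assumes the links have already been loaded into an in-memory dictionary. The function name, the data structures, and the document IDs (`CDR1` through `CDR4`) are hypothetical, and the output format is only loosely modeled on the traces earlier in this ticket.

```python
from collections import deque


def find_broken_links(links, existing, start_ids):
    """Map each broken (source, target) link to the first chain that reached it.

    links: dict of doc ID -> list of (target ID, XPath of the linking attribute)
    existing: set of doc IDs which still exist
    start_ids: doc IDs to start from (e.g. active board member documents)
    """
    broken = {}
    for start in start_ids:
        seen = {start}  # guards against cycles in the link graph
        queue = deque([(start, [])])
        while queue:
            doc_id, chain = queue.popleft()
            for target, path in links.get(doc_id, []):
                step = chain + [(doc_id, target, path)]
                if target not in existing:
                    # Keep only the chain from the first time this link is found.
                    broken.setdefault((doc_id, target), step)
                elif target not in seen:
                    seen.add(target)
                    queue.append((target, step))
    return broken


# Hypothetical link table: member info -> person -> organization -> removed parent.
links = {
    "CDR1": [("CDR2", "/PDQBoardMemberInfo/BoardMemberName/@cdr:ref")],
    "CDR2": [("CDR3", "/Person/PersonLocations/OtherPracticeLocation/OrganizationLocation/@cdr:ref")],
    "CDR3": [("CDR4", "/Organization/OrganizationParent/ParentOrganization/@cdr:ref")],
}
existing = {"CDR1", "CDR2", "CDR3"}  # CDR4 has been removed

broken = find_broken_links(links, existing, ["CDR1"])
for (source, target), chain in broken.items():
    for src, dst, path in chain:
        print(f"{src} => {dst} using {path}")
    print(f"{source} LINK TO {target} IS BROKEN")
```

Breadth-first order means the recorded chain is a shortest one from the starting document, which keeps the report readable when the same broken link is reachable by several routes.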
I ran the roster reports on DEV and they worked fine. However, I compared the results with a relatively recent version of the report from PROD (from May 2019) and noticed some discrepancies that seem like they could be due to lost parent orgs. The differences are almost entirely among NIH staff addresses, where we've listed a division, NCI, NIH, etc. I'm attaching a document highlighting the differences.
I think these could all be corrected manually by adding the additional org levels to the address in the person record - so not a deal breaker - but it may be worth running a diff report or something on PROD when we get to that point rather than relying on my eyes and this imperfect DEV/PROD comparison. 🙂
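One possible shape for such a diff report, assuming the before and after versions of the roster report have been saved as plain text, is a unified diff from Python's standard `difflib` module. The function name and the sample addresses below are invented for illustration and do not come from the actual reports.

```python
import difflib


def diff_reports(old_text, new_text, old_label="PROD", new_label="DEV"):
    """Return unified-diff lines between two saved versions of a report."""
    return list(
        difflib.unified_diff(
            old_text.splitlines(),
            new_text.splitlines(),
            fromfile=old_label,
            tofile=new_label,
            lineterm="",
        )
    )


# Invented sample: an address that loses an org level after parent links are removed.
old = "Jane Doe, MD\nDivision of Example Research\nNational Cancer Institute\nBethesda, MD"
new = "Jane Doe, MD\nNational Cancer Institute\nBethesda, MD"

changes = diff_reports(old, new)
print("\n".join(changes))
```

Lines prefixed with `-` in the output would be the org levels missing from the newer report, which are the ones needing manual correction in the person record.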
Thanks for the report. I was hoping to spot check a handful of these documents but I'm not able to get XMetaL on STAGE to open. It spins and times out.
We are manually correcting these addresses in the person records.
File Name | Posted | User |
---|---|---|
broken-links-stage.txt | 2019-08-13 05:01:55 | Kline, Bob (NIH/NCI) [C] |
Comparison of Board Roster Reports.docx | 2019-08-13 09:59:24 | Juthe, Robin (NIH/NCI) [E] |