CDR Tickets

Issue Number 3501
Summary [GenProf] Global Change to remove terms from Genetics Professional Person records.
Created 2012-04-19 18:12:38
Issue Type Improvement
Submitted By Osei-Poku, William (NIH/NCI) [C]
Assigned To alan
Status Closed
Resolved 2013-09-10 16:46:03
Resolution Fixed
Path /home/bkline/backups/jira/ocecdr/issue.107829
Description

BZISSUE::5196
BZDATETIME::2012-04-19 18:12:38
BZCREATOR::William Osei-Poku
BZASSIGNEE::Alan Meyer
BZQACONTACT::William Osei-Poku

We need to remove CDR0000654587 (familial carcinoid syndrome) and CDR0000654671 (osteochondromatosis) from over 200 Gen Prof person records. The menu items for these terms have been removed, which is causing the records on Cancer.gov to show empty cells. I was hoping to use the Global Change Links utility to make these changes, but it wouldn't work because it requires replacement terms in order to run successfully. It would be helpful to have a general-purpose global change for this kind of removal in the future. Modifying the existing utility so that it can remove terms from records would be helpful; a new general-purpose global change for removing terms would also work.

Samples of documents on cancer.gov with empty rows:
http://cancer.gov/cancertopics/genetics/directory/view?personid=665060
http://cancer.gov/cancertopics/genetics/directory/view?personid=1558

Comment entered 2012-05-22 15:34:26 by Osei-Poku, William (NIH/NCI) [C]

BZDATETIME::2012-05-22 15:34:26
BZCOMMENTOR::William Osei-Poku
BZCOMMENT::1

Lowered priority to P6

Comment entered 2013-04-04 11:20:11 by Osei-Poku, William (NIH/NCI) [C]

BZDATETIME::2013-04-04 11:20:11
BZCOMMENTOR::William Osei-Poku
BZCOMMENT::2

Added the GenProf tag.

Comment entered 2013-07-11 12:36:08 by alan

This has been sitting around for a long time, so I looked at it long enough to figure out what's going on and what needs to be done.

My sense is that it will take no more time to do this the ideal way that William would like (modify the Global Change Links utility to permit deleting a link without adding one) than to write a custom one-off global change.

I think it will require less than a day for me to make and test the changes. CIAT will then want to test pretty carefully since the program will be able to process lots more documents than a one-off global can do.

The priority on this is a P6 but, since I'm proposing a more general fix for the problem than just GenProf docs, we might want to raise it to a P5 and go ahead with it.

I won't do anything more unless someone raises the priority and says to proceed.

Comment entered 2013-07-16 22:22:41 by alan

The program to modify is cdr\lib\Python\GlobalChangeLinkBatch.py.

Currently, the program asks the user (line 474) to:

"Replace this link with one or more others"
or
"Keep this link and add one or more others"

and sets the cgi variable "replaceOld" to "Yes" or "No".

The program always prompts for new links (line 517). This needs
to be modified to only do this if the user has said to add them.

The program always adds any new links before deleting the old
(line 939). This probably requires no modification since there
will be no links to add and nothing done.

Then, if replaceOld is set to "Yes", it finds the parent of the
element to replace and removes the child from it (line 962).
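
The add-then-delete flow described above can be sketched roughly as follows. This is not the code in GlobalChangeLinkBatch.py; it is a minimal illustration using the standard library's ElementTree. The function name change_links, the attribute name "ref", and the sample element names are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

REF_ATTR = "ref"  # hypothetical stand-in for the real link attribute


def change_links(root, element_name, old_ref, new_refs, replace_old):
    """Add new links after each matching old link, then optionally delete it.

    Returns the number of old links found. Assumes the link elements are
    never the document root.
    """
    # ElementTree has no parent pointers, so build a child -> parent map.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    matches = [e for e in root.iter(element_name) if e.get(REF_ATTR) == old_ref]
    for old in matches:
        parent = parent_of[old]
        position = list(parent).index(old)
        # Step 1: add the new links first. With an empty list nothing is
        # added, which is what makes a pure delete possible.
        for offset, ref in enumerate(new_refs, start=1):
            new = ET.Element(element_name, {REF_ATTR: ref})
            new.tail = old.tail
            parent.insert(position + offset, new)
        # Step 2: remove the old link only when replaceOld is "Yes".
        if replace_old:
            parent.remove(old)
    return len(matches)
```

Calling it with an empty new_refs list and replace_old=True gives the delete-only behavior requested in this issue; the add step simply has nothing to do.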

Possible changes:

Method 1:

Convert to two checkboxes:

[ ] Check to delete this link, leave unchecked to retain it
[ ] Add one or more new links

Method 2:

Convert to three radio buttons, the same two plus a new one:

o Delete this link and do nothing else
o Replace this link with one or more others
o Keep this link and add one or more others
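
Either method reduces to the same two decisions: whether to delete the old link and whether to add new ones. A small sketch of that mapping, with hypothetical form values ("delete", "replace", "keep_and_add") and function names that are not taken from the program:

```python
def interpret_choice(choice):
    """Method 2: map a radio button value to (delete_old, add_new)."""
    actions = {
        "delete": (True, False),        # delete this link and do nothing else
        "replace": (True, True),        # replace this link with others
        "keep_and_add": (False, True),  # keep this link and add others
    }
    if choice not in actions:
        raise ValueError(f"unknown choice: {choice!r}")
    return actions[choice]


def interpret_checkboxes(delete_old, add_new):
    """Method 1: the two checkboxes already are the two flags."""
    if not delete_old and not add_new:
        # Neither box checked: the global change would have nothing to do.
        raise ValueError("nothing to do: neither box is checked")
    return (delete_old, add_new)
```

The only practical difference is that Method 1 allows a fourth combination (neither box checked) that has to be rejected, while Method 2 cannot express it.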

William:

If users have a preference, let me know. Otherwise I'll do
whichever seems easier. Method 1 might be slightly easier, but
I'm not sure and I doubt if there is much difference.

Comment entered 2013-07-17 15:05:12 by Osei-Poku, William (NIH/NCI) [C]

I have reviewed what you wrote above and I don't see any difference between the two approaches in terms of what they do. They appear to accomplish the same thing. So, please proceed with what is easier for you.

Comment entered 2013-08-30 00:07:35 by alan

I have completed the programming for this and my testing with
"osteochondromatosis" on DEV in test mode appeared to work. 298 documents
were processed in each of two tests.

I used a combination of the two user interface techniques mentioned in my
earlier comment because it saved a bit of implementation time, but I don't
think anyone will find it more or less intuitive than the two I described.

I did note a surprise when processing CDR0000668171 (Kara Bui). That document
(on DEV) has two instances of "osteochondromatosis". The global change
program will do the following with that record:

If deleting the link:
Both copies of the link are deleted.
That should be fine.

If replacing the link:
Both copies of the link will be replaced, causing the record to
contain two copies of a different link instead.
That would have to be fixed by hand.

The original implementation of the software must not have foreseen this sort
of data problem and wasn't equipped to handle it. If we were to try to fix
this, it might be better to fix it in a validation routine rather than in the
global change, but it's a non-trivial problem that I would think we don't want
to undertake at this time. I presume the error is rare.
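
For illustration only, a check for this kind of duplicate could look something like the sketch below. It is not part of the global change or of any existing validation routine; the function name, the "ref" attribute name, and the sample data are assumptions.

```python
import xml.etree.ElementTree as ET
from collections import Counter

REF_ATTR = "ref"  # hypothetical stand-in for the real link attribute


def duplicate_links(root, element_name):
    """Return the link targets that occur more than once in one document."""
    counts = Counter(
        e.get(REF_ATTR)
        for e in root.iter(element_name)
        if e.get(REF_ATTR) is not None
    )
    return sorted(ref for ref, n in counts.items() if n > 1)
```

Run against each document before or after a global change, a non-empty result would flag records like the one above for correction by hand.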

Links to test results on DEV are provided below. To use the links it will be
necessary to use them from a browser on the DEV bastion host.

Replacing the link with a new link:

https://cdr.dev.cancer.gov/cgi-bin/cdr/ShowGlobalChangeTestResults.py?dir=2013-08-29_23-00-16

Deleting the link with no replace:

https://cdr.dev.cancer.gov/cgi-bin/cdr/ShowGlobalChangeTestResults.py?dir=2013-08-29_23-33-29

Comment entered 2013-08-30 00:09:05 by alan

Ready for QA on DEV.

Comment entered 2013-09-03 18:42:53 by Osei-Poku, William (NIH/NCI) [C]

I tested the enhancement on DEV and it is working well. I just noticed two minor issues:

1. The program works well when deleting a term without adding a new term, provided you have a CDR ID ready. However, if you don't have a CDR ID and have to type the leading characters into the search field, it returns the following error:
"Unable to convert 'None' into a CDR Document ID".

I must say that in almost all cases, we will have the CDR ID before performing the global so if it will take too much time to fix this, then you can probably decide to either not fix it or fix it later as it won't stop us from using the global.
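
The error text suggests that a missing ID reached the conversion step as None. The ticket does not show how the program converts IDs, so the following is only a sketch of a defensive check. It assumes the "CDR" plus ten zero-padded digits form seen in the IDs quoted in this issue, and the function name is hypothetical.

```python
import re


def normalize_cdr_id(value):
    """Return 'CDR' + 10 zero-padded digits, or None if no usable ID was given."""
    if value is None:
        return None
    # Accept "654587", "CDR654587", or "CDR0000654587", ignoring case and
    # surrounding whitespace; anything else is treated as "no ID entered".
    match = re.fullmatch(r"\s*(?:CDR)?0*(\d+)\s*", str(value), re.IGNORECASE)
    if not match:
        return None
    return f"CDR{int(match.group(1)):010d}"
```

A caller can then test for None and fall back to the title search instead of passing the empty value on to the ID conversion.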

2. Another minor issue is that FamilialCancerSyndromes (plural) is listed in addition to FamilialCancerSyndrome (singular) among the elements for the person document type. It looks like FamilialCancerSyndromes (plural) is not in use in the current Gen Prof schema. It may have been used in the past and, in fact, there is one document, 663185, which contains that element. However, that document is not well formed, probably because of that element. This is true on the production server as well. Once again, this is not a major issue, but it would be good if the plural form of the element could eventually be removed.

I also wanted to find out if you want me to test with other document types as well? At this point, I am only testing with the person document type.

Comment entered 2013-09-03 21:43:36 by alan

I've reopened the issue and will have a look at these. If they are easy, and I think at least the first one should be, I'll fix them.

The second issue is trickier. The picklist is generated from the schema, not from a list in the program. I'll try to figure out what's going on before the status meeting on Thursday if I can.

Comment entered 2013-09-05 15:19:59 by alan

As I reported at our status meeting, my previous comment about the picklist being generated from the schema was mistaken. It's generated from one of the link type tables (link_xml). I edited the "Term" link type definition to remove "FamilialCancerSyndromes" as a legal source of the link type and tested the program. That fixed the problem of having that obsolete element name in the picklist.

I checked the database on DEV and found no instances of FamilialCancerSyndromes in any current Term. I presume the same would be true on the more up to date PROD database.

I went ahead and updated the link type definitions on PROD and QA.

Comment entered 2013-09-10 16:46:03 by alan

This program is probably trickier than it needed to be (I was younger when I wrote it) and my change to enable deletion of a term without adding new ones turned out to have subtle ramifications that took me a while to track down.

I think it's right now. I'm marking it as resolved (fixed), ready for user testing again.

Comment entered 2013-09-10 16:50:12 by alan

Note to William or other testers:

The recent fix had to do with entering data in the data entry screens, which was producing the "Unable to convert 'None' into a CDR Document ID" error that William saw. I have not changed any of the code that makes the actual transformations since the previous change, so, while it's never a bad idea to do extra testing, the main thing to test is entering data.

The changed code is on DEV right now. It has not yet been moved to QA.

Comment entered 2013-09-17 19:22:19 by Osei-Poku, William (NIH/NCI) [C]

Verified all changes on DEV.

Comment entered 2013-11-08 11:11:09 by Osei-Poku, William (NIH/NCI) [C]

Verified on QA.

Comment entered 2013-11-26 19:41:22 by Osei-Poku, William (NIH/NCI) [C]

Verified on Prod.
