Issue Number | 3173 |
---|---|
Summary | [Terminology] Mark Drug/Agent Category terms as obsolete and Block from publishing |
Created | 2010-06-04 14:46:33 |
Issue Type | Improvement |
Submitted By | Grama, Lakshmi (NIH/NCI) [E] |
Assigned To | alan |
Status | Closed |
Resolved | 2010-09-09 11:49:35 |
Resolution | Fixed |
Path | /home/bkline/backups/jira/ocecdr/issue.107501 |
BZISSUE::4860
BZDATETIME::2010-06-04 14:46:33
BZCREATOR::Lakshmi Grama
BZASSIGNEE::Alan Meyer
BZQACONTACT::William Osei-Poku
Terms with Semantic Type of Drug/Agent Category are legacy terms that we no longer maintain. We would like to globally change the Term Type of these terms to Obsolete Terms and Block them from publishing. Prior to this please double check that there are no documents linking to terms of this semantic type. I don't believe we have used these terms in the past.
BZDATETIME::2010-06-09 00:42:23
BZCOMMENTOR::Alan Meyer
BZCOMMENT::1
I did some searching on Bach.
I was confused for a while by the fact that lots of documents
link to terms that have a Semantic Type of "Drug/Agent category".
However, it turns out that almost all of them are Term documents
themselves with the same Semantic Type that are narrower terms
for other Terms with that Semantic Type. All of them will be
blocked by this global change.
If I've searched the data correctly, it appears that there are
only two documents on Bach (and Mahler) that need fixing before
we are ready to run the requested global change.
They are:
CDR0000369994
A blocked protocol with an InterventionType link to
CDR0000040376, "immuno-therapeutic agent"
and
CDR0000256166
The heavily used Term "Drug/Agent" that has "Drug/Agent
category" as a ParentTerm/TermId.
Although 369994 is blocked, it seems to me a good idea to fix it
anyway even if we never plan to unblock it. It takes very little
time to do and will leave the database clean.
Since I'm testing on Mahler, it would provide the best test if we
fix them on both Mahler and Bach.
I'll start working on the global change, but won't run anything
until those two docs are fixed.
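A minimal sketch of the kind of check described above, not the actual search that was run. It assumes a hypothetical in-memory model in which each document records the semantic types it carries and the documents it links to. The document ids in the sample data are made up. The idea is that a linking document only needs fixing if it will not itself be blocked by the global change.

```python
# CDR ID of the Term "Drug/Agent category" (from this issue).
DRUG_AGENT_CATEGORY = "CDR0000256164"


def find_docs_needing_fixes(docs, semantic_type_id):
    """Return ids of documents that link to an affected term but are not
    themselves affected (and so would be left pointing at a blocked term).

    `docs` is a hypothetical mapping:
        doc_id -> {"semantic_types": set of ids, "links": set of doc ids}
    """
    # Terms carrying the semantic type are the ones the global will block.
    affected = {
        doc_id
        for doc_id, doc in docs.items()
        if semantic_type_id in doc["semantic_types"]
    }
    # Linkers inside the affected set are ignored: they get blocked too.
    return sorted(
        doc_id
        for doc_id, doc in docs.items()
        if doc_id not in affected and doc["links"] & affected
    )


# Made-up sample data, for illustration only.
SAMPLE_DOCS = {
    "term-1": {"semantic_types": {DRUG_AGENT_CATEGORY}, "links": set()},
    # A narrower term with the same semantic type: linked, but also blocked.
    "term-2": {"semantic_types": {DRUG_AGENT_CATEGORY}, "links": {"term-1"}},
    # A protocol linking to an affected term: needs fixing first.
    "protocol-1": {"semantic_types": set(), "links": {"term-2"}},
    # A term of another semantic type with an affected parent: needs fixing.
    "term-3": {"semantic_types": {"other-type"}, "links": {"term-1"}},
}

print(find_docs_needing_fixes(SAMPLE_DOCS, DRUG_AGENT_CATEGORY))
```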
BZDATETIME::2010-06-10 15:49:22
BZCOMMENTOR::William Osei-Poku
BZCOMMENT::2
(In reply to comment #1)
> Since I'm testing on Mahler, it would provide the best test if we
> fix them on both Mahler and Bach.
>
They have been fixed on Bach and Mahler. Please, let me know if anything was missed.
BZDATETIME::2010-06-18 00:13:40
BZCOMMENTOR::Alan Meyer
BZCOMMENT::3
I modified the ModifyDocs module to support blocking a document.
After working on another global change and seeing no new problem
introduced by this change, I promoted it to Bach and Franck.
BZDATETIME::2010-06-24 23:00:24
BZCOMMENTOR::Alan Meyer
BZCOMMENT::4
I looked at some examples of obsolete terms in the database and
I'm not sure they were all handled the same way. So here are
some questions about relevant parts of the documents.
1. /Term/TermType/TermTypeName
This is a multiply occurring element. What should I put here
in the cases when:
a. There is an existing TermTypeName.
Should I:
1) Leave it alone?
2) Replace it with "Obsolete term"?
3) Keep what's there but prepend or append something, e.g.,
"Header term; Obsolete term"
4) Add another TermTypeName occurrence with the value
"Obsolete term"?
b. There is no existing TermTypeName.
1) Create one with the value "Obsolete term"?
2) Do nothing?
2. /Term/SemanticType.
This is another multiply occurring element. All of the
documents affected by this global will have at least one with
the attribute cdr:ref = "CDR0000256164" (the CDR ID of the
Term "Drug/Agent category".) There may be others.
Should I:
a. Delete the element with the 256164 attribute and leave any
others alone?
b. Should I leave all of them alone?
3. /Term/SemanticTypeText
At least some of the Terms of interest will have a
SemanticTypeText of "Drug/Agent category", but some may not.
It's an optional element. Some may have more than one.
I found one Term document with a SemanticTypeText value of:
"No SemanticTypeText-Obsolete"
The questions here are the same as for the others.
a. There are one or more existing SemanticTypeText elements.
1) Leave them alone?
2) Delete all of them?
3) Delete all of them and add a single one with the value
"No SemanticTypeText-Obsolete"?
4) Delete only the one with the value "Drug/Agent
category" and replace it.
5) Delete only the one with the value "Drug/Agent
category" and NOT replace it.
b. There are no existing SemanticTypeText elements.
1) Do nothing?
2) Add a new one with "No SemanticTypeText-Obsolete"?
4. /Term/TermStatus
There are four possible values of this element:
"Unreviewed"
"Reviewed-retain"
"Reviewed-offline"
"Reviewed-problematic"
Of those, three are actually used. "Reviewed-offline" isn't
found on Mahler.
Should this element be changed at all or left alone?
I'm not sure I've identified all of the places that we might or
might not want to modify in this global change. If anyone knows
of any others, please let me know.
I'm going to suspend work on this task until we have definitive
specifications for what to do on all of the above four elements.
BZDATETIME::2010-07-22 20:05:51
BZCOMMENTOR::Alan Meyer
BZCOMMENT::5
(In reply to comment #4)
> ...
> I'm going to suspend work on this task until we have definitive
> specifications for what to do on all of the above four elements.
We're slipping through the cracks here. Maybe someone with more
time available than Lakshmi can take a look at this?
BZDATETIME::2010-07-23 12:21:51
BZCOMMENTOR::Lakshmi Grama
BZCOMMENT::6
1. /Term/TermType/TermTypeName
This is a multiply occurring element. What should I put here
in the cases when:
a. There is an existing TermTypeName.
Add another TermTypeName occurrence with the value "Obsolete term".
BZDATETIME::2010-07-23 12:25:05
BZCOMMENTOR::Lakshmi Grama
BZCOMMENT::7
b. There is no existing TermTypeName: I don't think there are any records like this. If there are, create one with the value "Obsolete term".
2. Do nothing with SemanticType or SemanticTypeText. No change is needed here.
Everything else can stay the same.
Please check with Volker to confirm that terms with the "Obsolete term" value in TermTypeName are not published. If they are being picked up for publication, let us also block the terms in this global.
BZDATETIME::2010-07-23 12:34:57
BZCOMMENTOR::Volker Englisch
BZCOMMENT::8
(In reply to comment #7)
> Please check with Volker to double check that terms with Obsolete term value in
> term type are not published.
We are publishing everything that's active with a TermStatus of
'Reviewed-Problematic'
'Reviewed-Retain'
'Unreviewed'
There are no other restrictions (except for the time stamp of the versioned document which must be before the time the publishing job started).
BZDATETIME::2010-07-23 12:46:26
BZCOMMENTOR::Lakshmi Grama
BZCOMMENT::9
In that case rather than blocking, we should update the TermStatus to Reviewed-offline. That should keep the terms from being published.
BZDATETIME::2010-08-03 21:43:08
BZCOMMENTOR::Alan Meyer
BZCOMMENT::10
(In reply to comment #8)
... From Volker:
> We are publishing everything that's active with a TermStatus of
> 'Reviewed-Problematic'
> 'Reviewed-Retain'
> 'Unreviewed'
... From Lakshmi
(In reply to comment #9)
> In that case rather than blocking, we should update the TermStatus to
> Reviewed-offline. That should keep the terms from being published.
When Volker said "everything that's active", he meant that blocked documents (i.e., inactive documents) will NOT be published, regardless of TermStatus. We never publish blocked documents of any type, for any reason. So if we block them, as per the original specification, it won't be necessary to modify the TermStatus to keep them from being published.
I presume that means that the original plan of leaving TermStatus alone and blocking the documents is still the right one. Modifying the TermStatus would introduce a questionable value into the documents in order to achieve what blocking will do more directly.
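The rule as I understand it can be written down as a small predicate. This is only a sketch, not the publishing code: the function name is hypothetical, "A" as the active_status value for an active document is an assumption, and the time stamp condition Volker mentioned is left out. Values are compared case-insensitively because the capitalization of the TermStatus values varies between comments in this issue.

```python
# TermStatus values that are published, per comment #8 (lower-cased).
PUBLISHABLE_TERM_STATUSES = {
    "unreviewed",
    "reviewed-retain",
    "reviewed-problematic",
}


def is_term_publishable(active_status, term_status):
    """Return True if a Term would be picked up for publishing.

    active_status: "A" (assumed to mean active) or "I" (inactive/blocked).
    term_status:   the text of /Term/TermStatus.
    """
    # A blocked document is never published, whatever its TermStatus.
    if active_status != "A":
        return False
    return term_status.strip().casefold() in PUBLISHABLE_TERM_STATUSES


# Blocking alone is enough; TermStatus does not have to change.
print(is_term_publishable("I", "Reviewed-retain"))
# Changing TermStatus alone would also work, but alters the data.
print(is_term_publishable("A", "Reviewed-offline"))
```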
BZDATETIME::2010-08-03 23:48:35
BZCOMMENTOR::Alan Meyer
BZCOMMENT::11
After re-reading all of the comments in this issue, it seems to
me that we want the global change to do the following things.
I'd like someone to confirm that or correct any errors:
1. Select all Term documents with a SemanticType link to
CDR0000256164, the document whose PreferredName is
"Drug/agent category".
2. Make NO changes to any existing information in the selected
documents.
Whatever information is currently there, including the
SemanticType, the TermTypeName, and the TermStatus elements,
will be unchanged.
3. Add a new TermTypeName to each selected document with the
value "Obsolete term". This will be appended after the
existing TermTypeNames, if there are any, in the document.
When the document is stored, a new document title should be
generated with the additional "Obsolete term" appended, for
example:
"alkylsulfonate;Header term;Obsolete term;"
4. Store a current working document for each selected Term
document with these modifications, and with the active_status
= "I" (standing for "Inactive"). All versions of the
documents will then be blocked from publishing.
5. Do nothing to the last version and last publishable version
of the selected documents.
I haven't thought of a good reason to modify the last
version or the last publishable version. It seems
inconsistent to modify them and then block them.
In other circumstances I have sometimes proposed that we
modify blocked documents so that, if they are ever unblocked,
they will be valid. However in this case it seems odd to
specifically mark a document as obsolete so that if it ever
is unblocked it will be, what, obsolete?
I've written various parts of the program but will wait for
confirmation of the above five points before I put it all
together and run it in test mode.
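To make point 3 concrete, here is a minimal sketch of the transformation using Python's standard xml.etree.ElementTree. It is not the ModifyDocs-based program. The function names are hypothetical, the check that avoids adding a duplicate on a rerun is my own assumption, and the title format is a guess inferred only from the single example above (the real title is generated when the document is stored). Blocking (point 4) is not shown.

```python
import xml.etree.ElementTree as ET

OBSOLETE = "Obsolete term"


def mark_obsolete(term_xml):
    """Append a TermTypeName of "Obsolete term"; change nothing else."""
    root = ET.fromstring(term_xml)
    term_type = root.find("TermType")
    if term_type is None:
        # Where a new TermType block belongs depends on the schema, which
        # this sketch does not know, so it refuses rather than guessing.
        raise ValueError("no TermType element to append to")
    existing = [(e.text or "").strip() for e in term_type.findall("TermTypeName")]
    if OBSOLETE not in existing:  # assumption: a rerun must not duplicate
        # SubElement appends as the last child of TermType; this sketch
        # assumes TermType holds only TermTypeName children.
        ET.SubElement(term_type, "TermTypeName").text = OBSOLETE
    return ET.tostring(root, encoding="unicode")


def term_type_names(term_xml):
    """List the TermTypeName values of a Term document, in order."""
    root = ET.fromstring(term_xml)
    return [(e.text or "").strip() for e in root.iter("TermTypeName")]


def build_title(preferred_name, type_names):
    """Guessed title format: name and type names, each followed by ';'."""
    return ";".join([preferred_name, *type_names]) + ";"


SAMPLE = (
    "<Term><PreferredName>alkylsulfonate</PreferredName>"
    "<TermType><TermTypeName>Header term</TermTypeName></TermType></Term>"
)

changed = mark_obsolete(SAMPLE)
print(build_title("alkylsulfonate", term_type_names(changed)))
```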
BZDATETIME::2010-08-04 17:34:22
BZCOMMENTOR::Lakshmi Grama
BZCOMMENT::12
Agree with what you outline below. Please go ahead.
(In reply to comment #11)
> After re-reading all of the comments in this issue, it seems to
> me that we want the global change to do the following things.
> I'd like someone to confirm that or correct any errors:
>
> 1. Select all Term documents with a SemanticType link to
> CDR0000256164, the document whose PreferredName is
> "Drug/agent category".
>
> 2. Make NO changes to any existing information in the selected
> documents.
>
> Whatever information is currently there, including the
> SemanticType, the TermTypeName, and the TermStatus elements,
> will be unchanged.
>
> 3. Add a new TermTypeName to each selected document with the
> value "Obsolete term". This will be appended after the
> existing TermTypeNames, if there are any, in the document.
>
> When the document is stored, a new document title should be
> generated with the additional "Obsolete term" appended, for
> example:
>
> "alkylsulfonate;Header term;Obsolete term;"
>
> 4. Store a current working document for each selected Term
> document with these modifications, and with the active_status
> = "I" (standing for "Inactive"). All versions of the
> documents will then be blocked from publishing.
>
> 5. Do nothing to the last version and last publishable version
> of the selected documents.
>
> I haven't thought of a good reason to modify the last
> version or the last publishable version. It seems
> inconsistent to modify them and then block them.
>
> In other circumstances I have sometimes proposed that we
> modify blocked documents so that, if they are ever unblocked,
> they will be valid. However in this case it seems odd to
> specifically mark a document as obsolete so that if it ever
> is unblocked it will be, what, obsolete?
>
> I've written various parts of the program but will wait for
> confirmation of the above five points before I put it all
> together and run it in test mode.
BZDATETIME::2010-08-05 15:40:50
BZCOMMENTOR::Alan Meyer
BZCOMMENT::13
I've completed the programming for this task and run it in test mode
on Mahler. Results are in:
http://mahler.nci.nih.gov/cgi-bin/cdr/ShowGlobalChangeTestResults.py?dir=2010-08-05_15-24-32
104 documents were processed. There were no errors or warnings,
so I'm not bothering to attach the logfile.
I compared the 104 diffs programmatically and they're all the
same. So I think that looking at a few docs should be enough to
verify whether the program is working as expected.
The diffs use the regular "diff" utility, which produces
odd-looking output like this:
+ </TermTypeName>
+ <TermTypeName>
+ Obsolete term
In context it looks like this:
<TermType>
<TermTypeName>
Header term
+ </TermTypeName>
+ <TermTypeName>
+ Obsolete term
</TermTypeName>
</TermType>
which, I think, is what we wanted.
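One way to convince yourself that the odd-looking diff is equivalent to simply adding a second TermTypeName is to apply the "+" lines and parse the result. This is a small sketch for that purpose only. The helper names are hypothetical, it handles just added and context lines (there are no "-" lines in these diffs), and it treats any line starting with "+" as added.

```python
import xml.etree.ElementTree as ET

# The diff in context, as shown above.
DIFF_IN_CONTEXT = """\
<TermType>
<TermTypeName>
Header term
+ </TermTypeName>
+ <TermTypeName>
+ Obsolete term
</TermTypeName>
</TermType>
"""


def apply_added_lines(diff_text):
    """Rebuild the post-change text: keep context lines as they are and
    strip the leading '+' marker from added lines."""
    rebuilt = []
    for line in diff_text.splitlines():
        if line.startswith("+"):
            rebuilt.append(line[1:].strip())
        else:
            rebuilt.append(line.strip())
    return "\n".join(rebuilt)


def term_type_names(xml_text):
    """Return the TermTypeName values found in the fragment, in order."""
    root = ET.fromstring(xml_text)
    return [(e.text or "").strip() for e in root.iter("TermTypeName")]


# The rebuilt fragment is well formed and holds both values.
print(term_type_names(apply_added_lines(DIFF_IN_CONTEXT)))
```

The diff closes the existing element early and reuses its original closing tag for the new one, which is why it reads strangely even though the resulting XML is what we would write by hand.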
Test mode only tests the selection of documents and their
transformation. Whether the blocking works and whether the
titles are transformed to append "Obsolete term;" to the end
won't be tested until I run in live mode on Mahler. I'll wait
for a go-ahead before doing that.
BZDATETIME::2010-08-24 13:19:45
BZCOMMENTOR::Margaret Beckwith
BZCOMMENT::14
I think it is fine to run this in live mode on Mahler.
BZDATETIME::2010-08-24 15:01:18
BZCOMMENTOR::Alan Meyer
BZCOMMENT::15
The live run is complete. The log file is attached.
There was one validation error that took me a while to figure out:
Warning for CDR0000040388: Failed link target rule:
((/Term/TermType/TermTypeName=="Index term") or
(/Term/TermType/TermTypeName=="Header term") or
(/Term/TermType/TermTypeName=="Semantic type"))
It looks like the problem here is in the link:
/Term/TermRelations/ParentTerm/TermID/@cdr:ref='CDR0000040387'
This establishes that the parent of "extended spectrum penicillin"
is "penicillin". But 40387/penicillin is not an index term, not a
header term, and not a semantic type.
The validation error existed before the global change and still
exists after.
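For reference, the link target rule quoted above amounts to the following check on the parent document's TermTypeName values. This is a sketch of the rule's logic only, not the CDR link validation code. The function name is hypothetical and the sample values are illustrative, not read from any document.

```python
# The three values named in the link target rule.
ALLOWED_TARGET_TYPES = {"Index term", "Header term", "Semantic type"}


def link_target_is_valid(target_term_type_names):
    """Return True if the linked-to Term has at least one TermTypeName
    that the rule accepts."""
    return any(name in ALLOWED_TARGET_TYPES for name in target_term_type_names)


# A target that keeps "Header term" still passes after the global
# change appends "Obsolete term" to it.
print(link_target_is_valid(["Header term", "Obsolete term"]))
# A target with none of the three accepted values fails the rule.
print(link_target_is_valid(["Obsolete term"]))
```

Because the global change only appends a value and removes nothing, it cannot turn a passing link target into a failing one under this rule, which is consistent with the warning existing both before and after the run.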
Attachment Request4860.log has been added with description: Log file for live run on Mahler
BZDATETIME::2010-08-24 15:01:55
BZCOMMENTOR::Alan Meyer
BZCOMMENT::16
I think we can run this on Bach whenever we wish.
BZDATETIME::2010-08-30 11:25:02
BZCOMMENTOR::Margaret Beckwith
BZCOMMENT::17
Let's run it in a test mode on Bach and I will ask Mary Barnstead to take a look at it.
BZDATETIME::2010-08-30 12:36:42
BZCOMMENTOR::Alan Meyer
BZCOMMENT::18
(In reply to comment #17)
> Let's run it in a test mode on Bach and I will ask Mary Barnstead to take a
> look at it.
I will do some work on OCECDR-3209 today to see if I can ensure that
the error reporting is working properly in the global change
central module, then run this global on Bach before I leave tonight.
My hope is that, if global change validation isn't working right,
I can fix it in time for this run, which will make the run more
useful.
But whether I get that finished or not, I'll still run this global
tonight.
BZDATETIME::2010-08-30 20:50:02
BZCOMMENTOR::Alan Meyer
BZCOMMENT::19
I've run the test mode on Bach. Results are in:
http://bach.nci.nih.gov/cgi-bin/cdr/ShowGlobalChangeTestResults.py?dir=2010-08-30_20-36-22
See comment #13 for interpretation of the diffs.
The log file has no warnings or errors, so I have not attached it.
BZDATETIME::2010-09-01 07:09:56
BZCOMMENTOR::William Osei-Poku
BZCOMMENT::20
(In reply to comment #19)
> I've run the test mode on Bach. Results are in:
>
> http://bach.nci.nih.gov/cgi-bin/cdr/ShowGlobalChangeTestResults.py?dir=2010-08-30_20-36-22
>
> See comment #13 for interpretation of the diffs.
>
> The log file has no warnings or errors, so I have not attached
> it.
The test results look fine to me. The global appears to be doing exactly what you outlined in comment #11. I will ask Mary to also take a look and post a comment.
BZDATETIME::2010-09-01 09:47:59
BZCOMMENTOR::William Osei-Poku
BZCOMMENT::21
(In reply to comment #20)
> (In reply to comment #19)
>
> The test results look fine to me. The global appears to be doing exactly what
> you outlined in comment #11. I will ask Mary to also take a look and post a
> comment.
Mary also looked at the test results and said they are okay, so I believe you can run the global in live mode on Bach.
BZDATETIME::2010-09-02 10:28:20
BZCOMMENTOR::Alan Meyer
BZCOMMENT::22
The live run is complete. There were no surprises.
The log file is attached.
Attachment Request4860.log has been added with description: Log file from live run on Bach
BZDATETIME::2010-09-02 10:28:58
BZCOMMENTOR::Alan Meyer
BZCOMMENT::23
All done but final QA.
BZDATETIME::2010-09-09 11:49:35
BZCOMMENTOR::William Osei-Poku
BZCOMMENT::24
(In reply to comment #23)
> All done but final QA.
QA is completed. We did not see any problems.
Closing issue. Thanks!
File Name | Posted | User |
---|---|---|
Request4860.log | 2010-09-02 10:28:20 | |
Request4860.log | 2010-08-24 15:01:18 | |