Issue Number | 3102 |
---|---|
Summary | Terminology vendor filter changes |
Created | 2010-03-03 16:31:39 |
Issue Type | Improvement |
Submitted By | Osei-Poku, William (NIH/NCI) [C] |
Assigned To | Englisch, Volker (NIH/NCI) [C] |
Status | Closed |
Resolved | 2010-06-21 14:06:20 |
Resolution | Fixed |
Path | /home/bkline/backups/jira/ocecdr/issue.107430 |
BZISSUE::4778
BZDATETIME::2010-03-03 16:31:39
BZCREATOR::William Osei-Poku
BZASSIGNEE::Volker Englisch
BZQACONTACT::William Osei-Poku
When multiple OtherNameType values are selected within a single OtherName block in a Term document, the second OtherNameType value does not display on cancer.gov. We work around this issue by adding another OtherName block and selecting the second OtherNameType value there. An example is 442270.
Please modify the vendor filter so that when multiple OtherNameType values are selected within a single OtherName block, they will all display.
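The requested behavior can be illustrated with a small Python sketch. It is not the vendor filter itself (CDR filters are not shown in this issue), and the element names `Term`, `OtherName`, `OtherTermName` and `OtherNameType`, as well as the sample name "Examplex", are assumptions made for illustration only. The idea is that every OtherNameType in a block yields its own (name, type) entry instead of only the first one.

```python
import xml.etree.ElementTree as ET

# Hypothetical Term fragment; element names and values are assumed for illustration.
SAMPLE = """
<Term>
  <OtherName>
    <OtherTermName>Examplex</OtherTermName>
    <OtherNameType>US brand name</OtherNameType>
    <OtherNameType>Foreign brand name</OtherNameType>
  </OtherName>
</Term>
"""


def expand_other_names(term_xml):
    """Return one (name, type) pair for every OtherNameType in every OtherName block."""
    root = ET.fromstring(term_xml.strip())
    pairs = []
    for block in root.iter("OtherName"):
        name = (block.findtext("OtherTermName") or "").strip()
        # Emit every type found in the block, not just the first one.
        for name_type in block.findall("OtherNameType"):
            pairs.append((name, (name_type.text or "").strip()))
    return pairs


for name, name_type in expand_other_names(SAMPLE):
    print(f"{name}\t{name_type}")
```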
BZDATETIME::2010-03-11 18:35:55
BZCOMMENTOR::Volker Englisch
BZCOMMENT::1
The following filter has been modified:
CDR000134 - Vendor Filter: Term
This is ready for review on MAHLER.
BZDATETIME::2010-03-12 14:00:14
BZCOMMENTOR::William Osei-Poku
BZCOMMENT::2
(In reply to comment #1)
> The following filter has been modified:
> CDR000134 - Vendor Filter: Term
>
> This is ready for review on MAHLER.
How do you suggest that I test this one, since there is no publish preview for the Term document? I filtered the document on Mahler and compared it with the filtered document on Bach, and I did not see any difference.
BZDATETIME::2010-03-12 14:07:37
BZCOMMENTOR::Volker Englisch
BZCOMMENT::3
You could run the vendor filter for terms and look at the XML. The term documents aren't too complicated to compare the XML output.
Let me know if you need help with this.
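One way to make the comparison suggested above less error-prone is a line-based diff of the two filtered outputs. The following is a minimal sketch using Python's standard `difflib`; it assumes the filtered XML from each server has already been saved or copied as text, and the function name and the "BACH"/"MAHLER" labels are only illustrative.

```python
import difflib


def diff_filtered_output(old_xml, new_xml):
    """Return unified-diff lines between two filtered Term documents."""
    return list(
        difflib.unified_diff(
            old_xml.splitlines(),
            new_xml.splitlines(),
            fromfile="BACH",    # output of the current filter (label is an assumption)
            tofile="MAHLER",    # output of the modified filter (label is an assumption)
            lineterm="",
        )
    )


# Made-up fragments standing in for the two filter outputs.
old = "<Term>\n<Type>US brand name</Type>\n</Term>"
new = "<Term>\n<Type>US brand name</Type>\n<Type>Foreign brand name</Type>\n</Term>"

for line in diff_filtered_output(old, new):
    print(line)
```

An empty result means the two outputs are identical line for line; added lines are prefixed with `+`.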
BZDATETIME::2010-03-12 14:17:39
BZCOMMENTOR::William Osei-Poku
BZCOMMENT::4
(In reply to comment #3)
> You could run the vendor filter for terms and look at the XML. The term
> documents aren't too complicated to compare the XML output.
>
> Let me know if you need help with this.
I did and saw that the US brand name and the foreign brand name appear on different rows in the output so I am assuming it is OK now.
BZDATETIME::2010-03-12 14:24:01
BZCOMMENTOR::William Osei-Poku
BZCOMMENT::5
(In reply to comment #4)
> (In reply to comment #3)
> > You could run the vendor filter for terms and look at the XML. The term
> > documents aren't too complicated to compare the XML output.
> >
> > Let me know if you need help with this.
>
> I did and saw that the US brand name and the foreign brand name appear on
> different rows in the output so I am assuming it is OK now.
Verified on Mahler. Please promote to Bach.
BZDATETIME::2010-03-12 15:10:22
BZCOMMENTOR::Volker Englisch
BZCOMMENT::6
The following filter has been modified:
CDR000134 - R9523: Vendor Filter: Term
I will need to run a diff report on FRANCK before promoting to production.
BZDATETIME::2010-03-12 16:00:47
BZCOMMENTOR::Volker Englisch
BZCOMMENT::7
I ran the diff report and identified a change for 72 of the term
documents to be published. At least for some of these terms a
work-around has already been used to make the second term type appear in
the vendor output. I am guessing that these terms will be displayed
multiple times on Cancer.gov if the documents don't get cleaned up
before we move the changed vendor filter into production.
A sample of such a term that needs to be updated is CDR299061 -
sunitinib malate.
The term name SU11248 will be displayed twice for the NameType of 'Code
name'.
How would you like to proceed? Do we want to clean up the data before or after we publish?
BZDATETIME::2010-03-12 16:05:41
BZCOMMENTOR::Volker Englisch
BZCOMMENT::8
Attachment term_ids.txt has been added with description: CDR-IDs for terms with multiple OtherNameType entries.
BZDATETIME::2010-03-12 16:58:59
BZCOMMENTOR::William Osei-Poku
BZCOMMENT::9
(In reply to comment #7)
> I ran the diff report and identified a change for 72 of the term documents to
> be published. At least for some of these terms a work-around has already been
> used to make the second term type appear in the vendor output. I am guessing
> that these terms will be displayed multiple times on Cancer.gov if the documents
> don't get cleaned up before we move the changed vendor filter into
> production.
> A sample of such a term that needs to be updated is CDR299061 - sunitinib
> malate.
> The term name SU11248 will be displayed twice for the NameType of 'Code name'.
> How would you like to proceed? Do we want to clean up the data before or
> after we publish?
We will fix the terms before you move the vendor filter into production. I will let you know when it is completed. Thank you!
BZDATETIME::2010-04-01 11:54:41
BZCOMMENTOR::William Osei-Poku
BZCOMMENT::10
(In reply to comment #9)
> (In reply to comment #7)
>
> We will fix the terms before you move the vendor filter into production. I will
> let you know when it is completed. Thank you!
It will take a while to finish the cleanup. I will post a comment when the cleanup is finished.
BZDATETIME::2010-05-28 14:52:05
BZCOMMENTOR::William Osei-Poku
BZCOMMENT::11
(In reply to comment #10)
> (In reply to comment #9)
> > (In reply to comment #7)
> >
> > We will fix the terms before you move the vendor filter into production. I will
> > let you know when it is completed. Thank you!
>
> It will take a while to finish the cleanup. I will post a comment when the
> cleanup is finished.
The cleanup is done.
Please promote to Bach. Can you do another diff report after the change on Bach? I think that will be helpful.
BZDATETIME::2010-05-28 14:55:46
BZCOMMENTOR::Volker Englisch
BZCOMMENT::12
How about we run a diff on FRANCK before we move this to production?
We refreshed FRANCK on Wednesday, so I would guess that most of the updates were already included, unless the updates were made mostly during the last two days.
BZDATETIME::2010-05-28 14:59:19
BZCOMMENTOR::William Osei-Poku
BZCOMMENT::13
(In reply to comment #12)
> How about we run a diff on FRANCK before we move this to production?
> We refreshed FRANCK on Wednesday, so I would guess that possibly most of the
> updates were already included unless the updates were done mostly during the
> last two days.
A few of the changes were done yesterday and today.
BZDATETIME::2010-05-28 19:15:07
BZCOMMENTOR::Volker Englisch
BZCOMMENT::14
I ran a report identifying those Term documents that are still listing two OtherName blocks for the same OtherTermName. There are still 189 of these.
Unfortunately, I'm not certain how I had selected the list earlier. I probably searched the vendor output but I'm not sure.
I am putting the SQL query in the query interface with the name
Terms with duplicate OtherName block
to identify these terms with multiple OtherName blocks.
Please let me know if you'd like to proceed with putting the filter in production or if you would first want to clean up these other terms as well.
BZDATETIME::2010-06-03 09:42:12
BZCOMMENTOR::William Osei-Poku
BZCOMMENT::15
(In reply to comment #14)
> I ran a report identifying those Term documents that are still listing two
> OtherName blocks for the same OtherTermName. There are still 189 of these.
>
> Unfortunately, I'm not certain how I had selected the list earlier. I probably
> searched the vendor output but I'm not sure.
>
> I am putting the SQL query in the query interface with the name
> Terms with duplicate OtherName block
> to identify these terms with multiple OtherName blocks.
>
> Please let me know if you'd like to proceed with putting the filter in
> production or if you would first want to clean up these other terms as well.
We are reviewing the terms. Many of the terms have duplicate OtherName blocks because of data imported from the NCI Thesaurus. These will continue to have duplicate OtherName blocks because we will always want to keep them separate. For example, in CDR38294 (entinostat) there are two blocks with the <OtherName> text value of 'MS 275'. One comes from data imported from the NCI Thesaurus and the other is the Code name, possibly entered by users.
Would the changes you have made to the vendor filters cause duplicate OtherName blocks with the same <OtherName> value (as in the example I described above) to display as duplicate on cancer.gov? I checked on cancer.gov and it is currently not displaying any duplicate information.
BZDATETIME::2010-06-03 11:25:29
BZCOMMENTOR::Volker Englisch
BZCOMMENT::16
(In reply to comment #15)
> there are two blocks with the <OtherName> text value of 'MS 275'.
I believe you are referring to the <OtherName> blocks with the value 'MS-275'.
> Would the changes you have made to the vendor filters cause duplicate
> OtherName blocks with the same <OtherName> value (as in the example I
> described above) to display as duplicate on cancer.gov?
I can not answer this question with certainty because not all of the data the vendor output contains for a term is displayed on Cancer.gov. It appears that lexical variants, for instance, are not listed at all. One of those 'MS-275' blocks you've mentioned for the example above is marked as lexical variant, so it probably wouldn't be displayed twice anyway.
... 10 minutes later ...
I tested on MAHLER: If you do have two blocks with the same name and
they are both of the type of 'Code name' they do cause duplicates on
Cancer.gov.
As a sample please see
http://wwwgk.cancer.gov/drugdictionary/?CdrID=38294
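The condition found in this test (two separate OtherName blocks carrying the same name and the same type) can be checked per document with a short Python sketch. This is not the SQL query stored in the query interface, which is not reproduced in this issue; the element names `Term`, `OtherName`, `OtherTermName` and `OtherNameType`, and the 'Synonym' entry in the sample, are assumptions for illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical Term fragment; structure is assumed, SU11248 is the name cited in comment 7.
SAMPLE = """
<Term>
  <OtherName>
    <OtherTermName>SU11248</OtherTermName>
    <OtherNameType>Code name</OtherNameType>
  </OtherName>
  <OtherName>
    <OtherTermName>SU11248</OtherTermName>
    <OtherNameType>Code name</OtherNameType>
    <OtherNameType>Synonym</OtherNameType>
  </OtherName>
</Term>
"""


def duplicate_name_types(term_xml):
    """Return the (name, type) pairs that occur more than once in a Term document."""
    root = ET.fromstring(term_xml.strip())
    counts = Counter()
    for block in root.iter("OtherName"):
        name = (block.findtext("OtherTermName") or "").strip()
        for name_type in block.findall("OtherNameType"):
            counts[(name, (name_type.text or "").strip())] += 1
    # A pair counted more than once would be emitted, and so displayed, twice.
    return sorted(pair for pair, n in counts.items() if n > 1)


print(duplicate_name_types(SAMPLE))
```

Pairs that share a name but differ in type are not reported, since those are the legitimate case the filter change is meant to display.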
BZDATETIME::2010-06-03 11:33:08
BZCOMMENTOR::William Osei-Poku
BZCOMMENT::17
(In reply to comment #16)
> (In reply to comment #15)
> > there are two blocks with the <OtherName> text value of 'MS 275'.
>
> I believe you are referring to the <OtherName> blocks with the value 'MS-275'.
>
Yes. Sorry about the confusion.
> > Would the changes you have made to the vendor filters cause duplicate
> > OtherName blocks with the same <OtherName> value (as in the example I
> > described above) to display as duplicate on cancer.gov?
>
> I can not answer this question with certainty because not all of the data the
> vendor output contains for a term is displayed on Cancer.gov. It appears that
> lexical variants, for instance, are not listed at all. One of those 'MS-275'
> blocks you've mentioned for the example above is marked as lexical variant, so
> it probably wouldn't be displayed twice anyway.
>
> ... 10 minutes later ...
>
> I tested on MAHLER: If you do have two blocks with the same name and they are
> both of the type of 'Code name' they do cause duplicates on Cancer.gov.
> As a sample please see
> http://wwwgk.cancer.gov/drugdictionary/?CdrID=38294
Thanks. This clarifies it. In the case of 'Code name' we will clean all that up.
(In reply to comment #14)
> Please let me know if you'd like to proceed with putting the filter in
> production or if you would first want to clean up these other terms as well.
I will let you know when these are done before you put the changes into production.
BZDATETIME::2010-06-03 11:42:58
BZCOMMENTOR::Volker Englisch
BZCOMMENT::18
(In reply to comment #17)
> In the case of 'Code name' we will clean all that up.
Just to be clear: I only tested the situation for a duplicate Code Name. I'm sure the same will also happen for a duplicate Synonym and ChemicalStructureName. Although I believe it's unlikely to have two blocks listed with the same StructureName.
Let me know if you'd like me to run a test for the Synonyms as well.
BZDATETIME::2010-06-15 11:05:03
BZCOMMENTOR::William Osei-Poku
BZCOMMENT::19
We have completed the cleanup. Please put this change into production.
BZDATETIME::2010-06-16 12:35:37
BZCOMMENTOR::Volker Englisch
BZCOMMENT::20
The following filter has been copied to FRANCK and BACH:
CDR000134 - R9523: Vendor Filter: Term
Since we're publishing the terminology data nightly we will be able to verify the data on Cancer.gov tomorrow.
BZDATETIME::2010-06-18 16:54:15
BZCOMMENTOR::Volker Englisch
BZCOMMENT::21
It appears to me that everything looks good on Cancer.gov.
William, if you agree please close this issue.
BZDATETIME::2010-06-21 14:06:20
BZCOMMENTOR::William Osei-Poku
BZCOMMENT::22
(In reply to comment #21)
> It appears to me that everything looks good on Cancer.gov.
> William, if you agree please close this issue.
Yes. It looks good in the dictionary on cancer.gov. Thanks!
I have closed this issue.
File Name | Posted | User |
---|---|---|
term_ids.txt | 2010-03-12 16:05:41 | Englisch, Volker (NIH/NCI) [C] |