Issue Number | 2696 |
---|---|
Summary | Add CTGovInterventionType to CTGov exports |
Created | 2008-11-06 15:47:44 |
Issue Type | Improvement |
Submitted By | alan |
Assigned To | Kline, Bob (NIH/NCI) [C] |
Status | Closed |
Resolved | 2009-02-17 20:21:51 |
Resolution | Fixed |
Path | /home/bkline/backups/jira/ocecdr/issue.107024 |
BZISSUE::4367
BZDATETIME::2008-11-06 15:47:44
BZCREATOR::Alan Meyer
BZASSIGNEE::Bob Kline
BZQACONTACT::Lakshmi Grama
When the global change to add CTGovInterventionType elements
to Term documents is completed, the new element should be
included in protocols sent to NLM.
BZDATETIME::2008-11-13 10:56:48
BZCOMMENTOR::Bob Kline
BZCOMMENT::1
Lakshmi:
Any instructions on how to do this?
BZDATETIME::2008-11-13 13:19:17
BZCOMMENTOR::Bob Kline
BZCOMMENT::2
Next step is for Bob to post a description of what we currently do for intervention, then Lakshmi will provide instructions for how this will be modified.
BZDATETIME::2008-11-14 10:43:22
BZCOMMENTOR::Bob Kline
BZCOMMENT::3
Here's what the documentation for the Intervention mapping class says:
Transformation to convert all of the original document's
Intervention elements to the structure specified by NLM's
DTD, based on the semantic types of the terms found in the
Intervention elements. Implemented as a post-process since
the XSL/T filter which is fed the vendor XML document must
retrieve the semantic type information from the CDR, as it
is not exported with the vendor document.
For mapping logic, see mapping.xls (Lakshmi Grama, 2002-12-12,
revised 2002-12-13), attached to issue #1892 with comment #42.
Intervention type parents suppressed 2007-10-24 at Lakshmi's
request. Further revision of the mapping logic posted by Lakshmi
2008-05-18 with comment #43 of issue #4076.
Here's a link to the most recent version of that mapping logic:
BZDATETIME::2008-11-25 09:42:19
BZCOMMENTOR::Bob Kline
BZCOMMENT::4
Lakshmi and I met this morning to go over the revised logic for exporting intervention information to CT.gov.
LET PROTOCOL-DOC = VENDOR IN-SCOPE-PROTOCOL DOCUMENT
LET AOG-BLOCKS = ARRAY OF ARM-OR-GROUP ELEMENT BLOCKS OF PROTOCOL-DOC
FOR EACH INTERVENTION ELEMENT IN PROTOCOL-DOC:
    LET I-DESC = VALUE OF INTERVENTION-DESCRIPTION ELEMENT
    LET TYPE-DOC = TERM DOCUMENT LINKED BY INTERVENTION-TYPE ELEMENT
    LET CTG-TYPE = CTGOV-INTERVENTION-TYPE FROM TYPE-DOC
    LET AOG-LINKS = ARRAY OF VALUES FROM ARM-OR-GROUP-LINK ELEMENTS
    FOR EACH AOG-LINK IN AOG-LINKS:
        IF AOG-LINK NOT FOUND AS LABEL IN ANY MEMBER OF AOG-BLOCKS:
            RAISE EXCEPTION, FAILING EXPORT OF DOCUMENT
    IF AOG-BLOCKS IS EMPTY OR AOG-LINKS IS NOT EMPTY:
        FOR EACH INTERVENTION-NAME-LINK ELEMENT:
            LET LINKED-TERM-DOC = VENDOR DOCUMENT TARGET OF ELEMENT LINK
            LET S-TYPES = ARRAY OF SEMANTIC TYPES FROM LINKED-TERM-DOC
            IF NONE OF S-TYPES IS 'DRUG/AGENT COMBINATION':
                LET OUTPUT-NAME = PREFERRED-NAME FROM LINKED-TERM-DOC
                CREATE NEW INTERVENTION BLOCK IN OUTPUT DOCUMENT
                ADD INTERVENTION-TYPE CHILD TO BLOCK WITH VALUE FROM CTG-TYPE
                ADD INTERVENTION-NAME CHILD TO BLOCK WITH VALUE FROM OUTPUT-NAME
                IF AOG-BLOCKS IS NOT EMPTY:
                    IF I-DESC IS MISSING:
                        RAISE EXCEPTION
                    ADD INTERVENTION-DESCRIPTION CHILD TO BLOCK WITH VALUE FROM I-DESC
                    FOR EACH AOG-LINK IN AOG-LINKS:
                        ADD ARM-GROUP-LABEL CHILD TO BLOCK WITH VALUE OF AOG-LINK
        IF NO INTERVENTION-NAME-LINK ELEMENTS ARE PRESENT:
            LET OUTPUT-NAME = PREFERRED-NAME FROM TYPE-DOC
            CREATE NEW INTERVENTION BLOCK IN OUTPUT DOCUMENT
            ADD INTERVENTION-TYPE CHILD TO BLOCK WITH VALUE FROM CTG-TYPE
            ADD INTERVENTION-NAME CHILD TO BLOCK WITH VALUE FROM OUTPUT-NAME
            IF AOG-BLOCKS IS NOT EMPTY:
                IF I-DESC IS MISSING:
                    RAISE EXCEPTION
                ADD INTERVENTION-DESCRIPTION CHILD TO BLOCK WITH VALUE FROM I-DESC
                FOR EACH AOG-LINK IN AOG-LINKS:
                    ADD ARM-GROUP-LABEL CHILD TO BLOCK WITH VALUE OF AOG-LINK
This logic will replace the mapping spreadsheets cited in the previous comment. Please let me know if I have captured what we said accurately, Lakshmi.
We also discussed a report which will contain the following columns:
  Name of Term document linked by InterventionType element
  Name of Term document linked by InterventionNameLink element ("None" for Intervention blocks without any InterventionNameLink children)
  Mapped value for intervention_type element in output document
  Mapped value for intervention_name element in output document
  Count of unique occurrences for this combination of values in the other four columns
I will capture the raw data so that if the report shows anomalies in some of the mapping values Lakshmi can be provided with the list of InScopeProtocol documents for which the suspect mappings occurred.
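A rough sketch of how the report data could be tabulated while keeping the raw document IDs for later follow-up. This is only an illustration, not the report code itself: the record layout, the function names, and the sample values are all made up, and it assumes the count is the number of distinct protocol documents per combination.

```python
from collections import defaultdict


def tabulate_mappings(records):
    """Group raw mapping records by the four reported values.

    Each record is a hypothetical 5-tuple:
    (protocol_id, type_term_name, name_term_name, mapped_type, mapped_name)
    where name_term_name is None when the Intervention block has no
    InterventionNameLink children.
    """
    docs_by_combo = defaultdict(set)
    for protocol_id, type_term, name_term, mapped_type, mapped_name in records:
        combo = (type_term, name_term or "None", mapped_type, mapped_name)
        docs_by_combo[combo].add(protocol_id)
    return docs_by_combo


def report_rows(docs_by_combo):
    """Return one row per combination: the four values plus a document count."""
    return [combo + (len(ids),) for combo, ids in sorted(docs_by_combo.items())]


# Made-up sample data; the IDs and term names are placeholders.
records = [
    ("DOC-1", "chemotherapy", "cisplatin", "Drug", "cisplatin"),
    ("DOC-2", "chemotherapy", "cisplatin", "Drug", "cisplatin"),
    ("DOC-2", "radiation therapy", None, "Radiation", "radiation therapy"),
]
by_combo = tabulate_mappings(records)
for row in report_rows(by_combo):
    print(row)
```

Keeping the set of IDs per combination (rather than just a counter) is what makes it possible to hand over the list of documents behind any suspect row.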
BZDATETIME::2008-11-25 10:24:35
BZCOMMENTOR::Bob Kline
BZCOMMENT::5
Here's a revision of the logic, consolidating some duplicated information:
LET PROTOCOL-DOC = VENDOR IN-SCOPE-PROTOCOL DOCUMENT
LET AOG-BLOCKS = ARRAY OF ARM-OR-GROUP ELEMENT BLOCKS OF PROTOCOL-DOC
FOR EACH INTERVENTION ELEMENT IN PROTOCOL-DOC:
    LET I-DESC = VALUE OF INTERVENTION-DESCRIPTION CHILD ELEMENT
    LET TYPE-DOC = TERM DOCUMENT LINKED BY INTERVENTION-TYPE CHILD ELEMENT
    LET AOG-LINKS = ARRAY OF VALUES FROM ARM-OR-GROUP-LINK CHILD ELEMENTS
    LET CTG-TYPE = CTGOV-INTERVENTION-TYPE FROM TYPE-DOC
    FOR EACH AOG-LINK IN AOG-LINKS:
        IF AOG-LINK NOT FOUND AS LABEL IN ANY MEMBER OF AOG-BLOCKS:
            RAISE EXCEPTION, FAILING EXPORT OF DOCUMENT
    IF AOG-BLOCKS IS EMPTY OR AOG-LINKS IS NOT EMPTY:
        IF AOG-BLOCKS IS NOT EMPTY AND I-DESC IS MISSING:
            RAISE EXCEPTION, FAILING EXPORT OF DOCUMENT
        FOR EACH INTERVENTION-NAME-LINK ELEMENT E:
            LET LINKED-TERM-DOC = VENDOR DOCUMENT TARGET OF E
            LET S-TYPES = ARRAY OF SEMANTIC TYPES FROM LINKED-TERM-DOC
            IF NONE OF S-TYPES IS 'DRUG/AGENT COMBINATION':
                LET OUTPUT-NAME = PREFERRED-NAME FROM LINKED-TERM-DOC
                CREATE NEW INTERVENTION BLOCK IN OUTPUT DOCUMENT
                ADD INTERVENTION-TYPE CHILD TO BLOCK WITH VALUE FROM CTG-TYPE
                ADD INTERVENTION-NAME CHILD TO BLOCK WITH VALUE FROM OUTPUT-NAME
                IF AOG-BLOCKS IS NOT EMPTY:
                    ADD INTERVENTION-DESCRIPTION CHILD TO BLOCK WITH VALUE FROM I-DESC
                    FOR EACH AOG-LINK IN AOG-LINKS:
                        ADD ARM-GROUP-LABEL CHILD TO BLOCK WITH VALUE OF AOG-LINK
        IF NO INTERVENTION-NAME-LINK ELEMENTS ARE PRESENT:
            LET OUTPUT-NAME = PREFERRED-NAME FROM TYPE-DOC
            CREATE NEW INTERVENTION BLOCK IN OUTPUT DOCUMENT
            ADD INTERVENTION-TYPE CHILD TO BLOCK WITH VALUE FROM CTG-TYPE
            ADD INTERVENTION-NAME CHILD TO BLOCK WITH VALUE FROM OUTPUT-NAME
            IF AOG-BLOCKS IS NOT EMPTY:
                ADD INTERVENTION-DESCRIPTION CHILD TO BLOCK WITH VALUE FROM I-DESC
                FOR EACH AOG-LINK IN AOG-LINKS:
                    ADD ARM-GROUP-LABEL CHILD TO BLOCK WITH VALUE OF AOG-LINK
REMOVE DUPLICATE INTERVENTION BLOCKS IN OUTPUT (IGNORING I-DESC DIFFERENCES)
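For readers who find Python easier to follow than the pseudocode, here is a minimal sketch of the same flow. It is not the export code: the dictionary layout, `ExportFailure`, `is_combination`, and `map_interventions` are hypothetical names, and it assumes that two blocks count as duplicates when their type, name, and arm/group labels match.

```python
class ExportFailure(Exception):
    """Signals that export of one protocol document should fail."""


def is_combination(term):
    """True if any semantic type of the term is 'Drug/agent combination'."""
    return any(s.upper() == "DRUG/AGENT COMBINATION" for s in term["semantic_types"])


def map_interventions(interventions, arm_labels, terms):
    """Build output intervention blocks for one protocol document.

    interventions: dicts with 'type_id', 'name_ids', 'description', 'arm_links'
    arm_labels:    set of labels taken from the arm-or-group blocks
    terms:         term ID -> dict with 'name', 'ctgov_type', 'semantic_types'
    """
    blocks, seen = [], set()
    for iv in interventions:
        type_doc = terms[iv["type_id"]]
        ctg_type = type_doc["ctgov_type"]
        links = iv["arm_links"]
        for link in links:
            if link not in arm_labels:
                raise ExportFailure(f"arm/group link {link!r} matches no label")
        if arm_labels and not links:
            continue  # trial has arms, but this intervention is tied to none
        if arm_labels and not iv["description"]:
            raise ExportFailure("missing intervention description")
        if iv["name_ids"]:
            names = [terms[i]["name"] for i in iv["name_ids"]
                     if not is_combination(terms[i])]
        else:
            names = [type_doc["name"]]
        for name in names:
            key = (ctg_type, name, tuple(links))  # description deliberately ignored
            if key in seen:
                continue
            seen.add(key)
            block = {"intervention_type": ctg_type, "intervention_name": name}
            if arm_labels:
                block["intervention_description"] = iv["description"]
                block["arm_group_labels"] = list(links)
            blocks.append(block)
    return blocks
```

Dropping duplicates as the blocks are built keeps the first description seen, which is one way to read "ignoring I-DESC differences"; a separate pass over the finished output would work equally well.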
BZDATETIME::2008-11-28 18:20:14
BZCOMMENTOR::Bob Kline
BZCOMMENT::6
Implemented on Mahler. Here's the report described in comment #4:
http://mahler.nci.nih.gov/InterventionMappings-20081128164537.html
and here are the results of a test run on Mahler:
http://mahler.nci.nih.gov/cgi-bin/cdr/ViewCTGovExports.py?job=20081128160210
BZDATETIME::2008-12-01 08:36:02
BZCOMMENTOR::Lakshmi Grama
BZCOMMENT::7
I would like Mary and Doug to review the first report, particularly those that are marked as Biologic/vaccine. I think some of these may need to be tagged as drug. I am also looking at the list and will send the spreadsheet with my questions to Mary.
BZDATETIME::2008-12-01 09:04:53
BZCOMMENTOR::Lakshmi Grama
BZCOMMENT::8
Need CDR ID for this combination in report
cardiotoxicity attenuation None Drug cardiotoxicity attenuation
BZDATETIME::2008-12-01 10:18:56
BZCOMMENTOR::Bob Kline
BZCOMMENT::9
(In reply to comment #8)
> Need CDR ID for this combination in report
> cardiotoxicity attenuation None Drug cardiotoxicity attenuation
>
CDR601334
BZDATETIME::2008-12-01 11:23:25
BZCOMMENTOR::Bob Kline
BZCOMMENT::10
Here's a version of the mapping report which allows you to see which documents were involved in any particular mapping combination by just clicking on the Count column for any row in the table, which will display the CDR IDs for the documents in which the mapping represented by that row was performed.
http://mahler.nci.nih.gov/cgi-bin/cdr/InterventionMappings.py
BZDATETIME::2008-12-01 16:01:37
BZCOMMENTOR::Bob Kline
BZCOMMENT::11
Here's what the results on Bach look like:
BZDATETIME::2008-12-01 17:25:10
BZCOMMENTOR::Lakshmi Grama
BZCOMMENT::12
(In reply to comment #10)
> Here's a version of the mapping report which allows you to see which documents
> were involved in any particular mapping combination by just clicking on the
> Count column for any row in the table, which will display the CDR IDs for the
> documents in which the mapping represented by that row was performed.
> http://mahler.nci.nih.gov/cgi-bin/cdr/InterventionMappings.py
It was a little hard to see this because I had to keep going to the top of the page to see the Ids and I kept losing my place. Could the IDs just show in another window?
BZDATETIME::2008-12-02 10:00:31
BZCOMMENTOR::Lakshmi Grama
BZCOMMENT::13
Bob and I discussed possible modifications to see if we could resolve some of the problems with mappings for terms that have a semantic type of drug/agent.
Also, in the context of CTGOV, we should only export an InterventionName as part of a single "intervention_type, intervention_name" pair. CTGOV is looking at intervention type as the inherent property of the intervention rather than its use as a modality or method of action (for drugs). In this paradigm, it is difficult to think of the same substance showing up in the same record as a Drug and a Dietary Supplement - which could possibly happen.
To avoid this, I am recommending that in CTGOV export, we only allow an intervention name to appear in one intervention_type/intervention_name pair. In cases where there are multiple possibilities, I recommend that we use a precedence table:
1. Drug
2. Biologic/Vaccine
3. Dietary Supplement.
This does not address cases where a dietary supplement such as white button mushroom extract is paired with an intervention type of aromatase inhibition therapy: it will be mapped to Drug, since aromatase inhibition therapy is mapped to Drug.
But it would be worth trying to see what, if any, really problematic instances are identified with this tweak to the logic.
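As an illustration only, the precedence rule might look something like this in Python. The function name and the choice to return `None` when the table cannot settle the matter are assumptions, and the spelling "Biological/Vaccine" follows the correction noted later in this thread.

```python
# Precedence table from the recommendation above, highest priority first.
PRECEDENCE = ["Drug", "Biological/Vaccine", "Dietary Supplement"]


def pick_type(candidate_types, precedence=PRECEDENCE):
    """Choose the one intervention type to export for an intervention name.

    Returns None when several types are present and none of them appears
    in the precedence table, so the caller can decide how to handle it.
    """
    unique = set(candidate_types)
    if len(unique) == 1:
        return next(iter(unique))
    for candidate in precedence:
        if candidate in unique:
            return candidate
    return None


print(pick_type(["Dietary Supplement", "Drug"]))
print(pick_type(["Genetic", "Procedure/Surgery"]))
```

The second call shows the gap in the table: two types that are both outside the list cannot be ranked against each other.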
BZDATETIME::2008-12-02 14:32:36
BZCOMMENTOR::Bob Kline
BZCOMMENT::14
(In reply to comment #12)
> It was a little hard to see this because I had to keep going to the top of the
> page to see the Ids and I kept losing my place. Could the IDs just show in
> another window?
Done. And here's the URL for the mappings using the modified logic based on the hierarchy of CT.gov intervention type values:
http://bach.nci.nih.gov/cgi-bin/cdr/InterventionMappings.py?suffix=20081202124117
BZDATETIME::2008-12-02 14:41:32
BZCOMMENTOR::Bob Kline
BZCOMMENT::15
This is an excerpt from the export logs for failures caused when the presence of multiple occurrences of the same intervention name could not be eliminated by using the new hard-wired intervention type precedence table.
Attachment multiple-types.log has been added with description: List of failures caused by duplicate intervention names
BZDATETIME::2008-12-02 14:51:16
BZCOMMENTOR::Bob Kline
BZCOMMENT::16
(In reply to comment #15)
> Created an attachment (id=1588) [details]
> List of failures caused by duplicate intervention names
>
> This is an excerpt from the export logs for failures caused when the presence
> of multiple occurrences of the same intervention name could not be eliminated
> by using the new hard-wired intervention type precedence table.
>
Looks like most (though not all) of the lines in this list of failures were caused by a typo in comment 13. I'm going to change "Biologic/Vaccine" to "Biological/Vaccine" and run a fresh test export job.
BZDATETIME::2008-12-02 15:51:59
BZCOMMENTOR::Bob Kline
BZCOMMENT::17
(In reply to comment #16)
> Looks like most (though not all) of the lines in this list of failures were
> caused by a typo in comment 13. I'm going to change "Biologic/Vaccine" to
> "Biological/Vaccine" and run a fresh test export job.
>
Here are the mappings from this latest run:
http://bach.nci.nih.gov/cgi-bin/cdr/InterventionMappings.py?suffix=20081202152603
BZDATETIME::2008-12-02 15:58:15
BZCOMMENTOR::Bob Kline
BZCOMMENT::18
Attachment failures.log has been added with description: Problems encountered during test export job.
BZDATETIME::2008-12-04 13:02:45
BZCOMMENTOR::Bob Kline
BZCOMMENT::19
Increased priority at status meeting.
BZDATETIME::2008-12-09 11:04:01
BZCOMMENTOR::Bob Kline
BZCOMMENT::20
This is a report requested by Lakshmi off-line.
Attachment InterventionNameSemanticTypes.xls has been added with description: Semantic types for intervention names
BZDATETIME::2008-12-11 07:55:00
BZCOMMENTOR::Bob Kline
BZCOMMENT::21
Lakshmi:
Could you post a summary here of what the new mapping logic will be? Thanks!
BZDATETIME::2008-12-11 16:04:25
BZCOMMENTOR::Bob Kline
BZCOMMENT::22
Attachment InterventionNameSemanticTypes.xls has been added with description: Same spreadsheet, but with the "Drug/agent combination" lines deleted
BZDATETIME::2008-12-17 08:56:39
BZCOMMENTOR::Lakshmi Grama
BZCOMMENT::23
Bob and I discussed modifications to logic based on the additional mapping of drug/agent terms to CTGOVInterventionType. Issue needs to be updated with the revised logic.
BZDATETIME::2008-12-17 09:32:23
BZCOMMENTOR::Bob Kline
BZCOMMENT::24
The notes we were scribbling the other day for the latest logic for mapping intervention values are a bit cryptic. Here's what I have been able to decipher (with the help of some fuzzy memory):
for each Intervention element in the trial document:
    find the InterventionType child of the Intervention element
    find the document (IT) linked by that element
    for each InterventionNameLink child:
        find the document (INL) linked by that child element
        get the preferred name (NAME) from that document
        get the semantic types for the INL document
        if any of these semantic types is 'Drug/agent combination':
            do nothing
        otherwise, if any of these semantic types is 'Drug/agent':
            find the CTGovInterventionType value from the INL document
            use it as intervention_type, with NAME as intervention_name
        otherwise:
            find the CTGovInterventionType value in the IT document
            use it as intervention_type, with NAME as intervention_name
    if the Intervention element has no InterventionNameLink children:
        find the CTGovInterventionType value in the IT document
        use it as intervention_type
        use preferred name of IT document as intervention_name
I've elided logic for the checks for arms and intervention descriptions, which hasn't changed.
Does this look right to you?
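To make the three-way choice in the steps above easier to check, here is a small Python sketch of just that decision. The dictionary keys and function names are hypothetical and the sample terms are invented; it is not the code that will go into the export software.

```python
DRUG_AGENT = "Drug/agent"
DRUG_AGENT_COMBINATION = "Drug/agent combination"


def map_name_link(it_doc, inl_doc):
    """Return (intervention_type, intervention_name) for one
    InterventionNameLink, or None if nothing should be exported."""
    semantic_types = set(inl_doc["semantic_types"])
    if DRUG_AGENT_COMBINATION in semantic_types:
        return None
    # Drug/agent terms supply their own type; everything else inherits
    # the type of the InterventionType term.
    source = inl_doc if DRUG_AGENT in semantic_types else it_doc
    return source["ctgov_type"], inl_doc["name"]


def map_without_links(it_doc):
    """Fallback when the Intervention element has no InterventionNameLink."""
    return it_doc["ctgov_type"], it_doc["name"]


# Invented terms, for illustration only.
it_doc = {"name": "type term", "ctgov_type": "Procedure/Surgery"}
drug = {"name": "drug term", "ctgov_type": "Drug", "semantic_types": ["Drug/agent"]}
other = {"name": "other term", "ctgov_type": "Genetic", "semantic_types": ["something else"]}

print(map_name_link(it_doc, drug))
print(map_name_link(it_doc, other))
```

Note that under this version a non-drug term's own CTGovInterventionType is never consulted, which is visible in the second call.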
BZDATETIME::2008-12-24 11:30:01
BZCOMMENTOR::Bob Kline
BZCOMMENT::25
(In reply to comment #24)
> ... Does this look right to you?
Lakshmi:
I'm holding off on the actual implementation until I get confirmation that you've looked over the logic and approve it.
BZDATETIME::2009-01-05 08:19:16
BZCOMMENTOR::Lakshmi Grama
BZCOMMENT::26
(In reply to comment #24)
I think the logic is correct. Please go ahead.
BZDATETIME::2009-01-05 11:06:13
BZCOMMENTOR::Bob Kline
BZCOMMENT::27
The code has been modified to reflect the new logic. As soon as the global change for issue #4414 has been run in live mode and I have confirmation that the results are correct I'll run a test export job on Bach.
Just to confirm: we still have the extra code to make sure only one intervention block goes out for a given intervention name, using the hard-wired hierarchy of intervention types (looking first for 'Drug' then for 'Biological/Vaccine' then for 'Dietary Supplement').
BZDATETIME::2009-01-05 14:39:27
BZCOMMENTOR::Lakshmi Grama
BZCOMMENT::28
> Just to confirm: we still have the extra code to make sure only one
> intervention block goes out for a given intervention name, using the
> hard-wired hierarchy of intervention types (looking first for 'Drug' then for
> 'Biological/Vaccine' then for 'Dietary Supplement').
Actually we may be able to do away with that code - I would like to see the results with and without this extra code so I can confirm.
BZDATETIME::2009-01-08 08:18:34
BZCOMMENTOR::Bob Kline
BZCOMMENT::29
[Writeup of high-level test plan requested by Lakshmi in another issue]
The tests for this change will be set up as follows:
[ ] Create test output set with old code (set A)
[ ] Create test output set with new code (set B)
[ ] Create test output set with part of new code [1] (set C)
[ ] Create diff report comparing set A with set B (Report D)
[ ] Create diff report comparing set B with set C (Report E)
[ ] CIAT (and possibly Lakshmi) reviews report D
[ ] Lakshmi reviews report E
[ ] Lakshmi and CIAT review sample documents from sets B and C
[ ] Lakshmi decides whether to keep code used to generate set B but not set C
[ ] Lakshmi and CIAT decide whether the new code is ready for production
[1] omitting the code to apply a hard-wired hierarchy of intervention types to
    ensure that no more than one intervention block is exported for a given
    intervention name
BZDATETIME::2009-01-12 10:43:58
BZCOMMENTOR::Bob Kline
BZCOMMENT::30
William has closed issue #4414, so I have begun the first test run of the export software using the existing code ("set A" in the test plan above).
BZDATETIME::2009-01-12 14:50:47
BZCOMMENTOR::Bob Kline
BZCOMMENT::31
(In reply to comment #30)
> William has closed issue #4414, so I have begun the first test run of the
> export software using the existing code ("set A" in the test plan above).
I have generated all three sets, and the two diff reports. However, unless you object, I'm going to run the jobs to create the three sets again, with slight modifications to the intervention output code to make the intervention blocks come out in a predictable order, and then generate the diff reports again. Otherwise, I think you'll find that the diff reports will be difficult to read, as a lot of the diff output reflects reordering of the intervention blocks, rather than real changes to the output. Another approach I could take would be to write custom code to compare the intervention block sets between the different runs. The advantage of this second approach is that we don't have to tamper with the code we're testing. The disadvantages are that it will take a little longer to write the extra code, and it might suppress evidence of inadvertent changes to other parts of the documents (though I didn't notice any such changes in my cursory review of the first set of reports). Which approach would you prefer?
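For what it's worth, the second approach (custom comparison code) could be as small as the sketch below. It assumes the exported documents hold `intervention` blocks with `intervention_type` and `intervention_name` children; the function names and the sample wrapper element are made up for the example.

```python
import xml.etree.ElementTree as ET


def intervention_pairs(xml_text):
    """Return the set of (intervention_type, intervention_name) pairs."""
    root = ET.fromstring(xml_text)
    pairs = set()
    for block in root.iter("intervention"):
        # Collapse whitespace so line wrapping does not register as a change.
        itype = " ".join(block.findtext("intervention_type", default="").split())
        name = " ".join(block.findtext("intervention_name", default="").split())
        pairs.add((itype, name))
    return pairs


def compare_interventions(old_xml, new_xml):
    """Return (pairs only in old, pairs only in new), ignoring block order."""
    old, new = intervention_pairs(old_xml), intervention_pairs(new_xml)
    return sorted(old - new), sorted(new - old)


# Invented documents; <doc> is a placeholder wrapper element.
old_doc = (
    "<doc>"
    "<intervention><intervention_type>Drug</intervention_type>"
    "<intervention_name>a</intervention_name></intervention>"
    "<intervention><intervention_type>Procedure</intervention_type>"
    "<intervention_name>b</intervention_name></intervention>"
    "</doc>"
)
new_doc = (
    "<doc>"
    "<intervention><intervention_type>Procedure/Surgery</intervention_type>"
    "<intervention_name>b</intervention_name></intervention>"
    "<intervention><intervention_type>Drug</intervention_type>"
    "<intervention_name>a</intervention_name></intervention>"
    "</doc>"
)
print(compare_interventions(old_doc, new_doc))
```

Because the blocks are compared as sets, reordering alone produces no output, but changes elsewhere in the document would go unnoticed, which is the disadvantage mentioned above.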
BZDATETIME::2009-01-12 16:35:37
BZCOMMENTOR::Lakshmi Grama
BZCOMMENT::32
> However, unless you object, I'm going to run the jobs to create the three sets
> again, with slight modifications to the intervention output code to make the
> intervention blocks come out in a predictable order, and then generate the diff
> reports again.
Please go ahead and do this
BZDATETIME::2009-01-13 10:29:07
BZCOMMENTOR::Bob Kline
BZCOMMENT::33
Here is the report comparing the first two sets:
http://bach.nci.nih.gov/issue4367-sets-a-and-b.html
There's still some distracting juggling that may be making it less straightforward to review the results, even though I sorted the blocks to make sure they weren't coming out in random order. The sorting was by intervention type, and under that by intervention name. It may be that if the sort had been by intervention name first and then by intervention type it would be easier still to review the results. I'll run the tests again with that modification if you think it's needed.
The third set is still being generated.
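A small illustration of why the sort order matters for the diff, using invented blocks (the helper name and the sample values are hypothetical). When a block's type changes between runs, sorting by type first can move the block, while sorting by name first keeps it in place so the diff shows only the changed line.

```python
def sort_blocks(blocks, name_first=False):
    """Sort intervention blocks by (type, name) or, optionally, (name, type)."""
    if name_first:
        key = lambda b: (b["intervention_name"], b["intervention_type"])
    else:
        key = lambda b: (b["intervention_type"], b["intervention_name"])
    return sorted(blocks, key=key)


def names(blocks):
    return [b["intervention_name"] for b in blocks]


# Invented example: the vaccine's type changes between the two runs.
run_1 = [
    {"intervention_type": "Drug", "intervention_name": "tumor vaccine"},
    {"intervention_type": "Drug", "intervention_name": "cisplatin"},
]
run_2 = [
    {"intervention_type": "Biological/Vaccine", "intervention_name": "tumor vaccine"},
    {"intervention_type": "Drug", "intervention_name": "cisplatin"},
]

print(names(sort_blocks(run_1)), names(sort_blocks(run_2)))
print(names(sort_blocks(run_1, name_first=True)),
      names(sort_blocks(run_2, name_first=True)))
```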
BZDATETIME::2009-01-13 14:00:00
BZCOMMENTOR::Lakshmi Grama
BZCOMMENT::34
Noticed this in my review - it seems odd
diff -ru set-a/CDR256897.xml set-b/CDR256897.xml
--- set-a/CDR256897.xml Tue Jan 13 08:56:53 2009
+++ set-b/CDR256897.xml Tue Jan 13 09:34:43 2009
@@ -77,11 +77,11 @@
 <condition>myelodysplastic syndromes</condition>
 <condition>myelodysplastic/myeloproliferative diseases</condition>
 <intervention>
-<intervention_type>Procedure</intervention_type>
+<intervention_type>Procedure/Surgery</intervention_type>
 <intervention_name>chromosomal translocation analysis</intervention_name>
 </intervention>
 <intervention>
-<intervention_type>Procedure</intervention_type>
+<intervention_type>Procedure/Surgery</intervention_type>
 <intervention_name>cytogenetic analysis</intervention_name>
 </intervention>
 <eligibility>
I checked the CDR term record on BACH for chromosomal translocation analysis and it is correctly mapped to CTGOVInterventionType of "Genetic" and yet here it is showing up with Intervention_type of Procedure/Surgery.
Could you check? The same applies to cytogenetic analysis.
BZDATETIME::2009-01-13 14:46:26
BZCOMMENTOR::Bob Kline
BZCOMMENT::35
The diff report between sets B and C had only one line in it, showing that CDR532941.xml only appeared in set C, not in set B. The log for set B has the following line:
!5404 Tue Jan 13 09:46:02 2009: failure processing CDR532941: intervention 'mutation carrier screening' has multiple types 'Genetic'; 'Procedure/Surgery'
The code we dropped for set C included logic to fail processing if the hard-coded hierarchy of types was unable to winnow down the number of intervention blocks for any given intervention name to one.
There are a number of "Only in ..." lines in the report of differences between sets A and B:
Only in set-a: CDR256871.xml
Only in set-a: CDR256919.xml
Only in set-a: CDR331829.xml
Only in set-a: CDR378183.xml
Only in set-b: CDR502363.xml
Only in set-a: CDR532941.xml
Only in set-a: CDR629778.xml
Only in set-a: CDR630380.xml
Only in set-a: CDR65713.xml
Only in set-a: CDR67380.xml
Only in set-a: CDR68093.xml
Only in set-a: CDR68106.xml
Here are the error messages (with the timestamps stripped) for the ones which were dropped with the new intervention mapping code:
failure processing CDR256871: missing CT.gov intervention type for CDR531923
failure processing CDR256919: missing CT.gov intervention type for CDR37779
failure processing CDR331829: missing CT.gov intervention type for CDR39187
failure processing CDR378183: missing CT.gov intervention type for CDR380753
failure processing CDR532941: intervention 'mutation carrier screening' has multiple types 'Genetic'; 'Procedure/Surgery'
failure processing CDR629778: missing CT.gov intervention type for CDR630382
failure processing CDR630380: missing CT.gov intervention type for CDR630596
failure processing CDR65713: missing CT.gov intervention type for CDR41911
failure processing CDR67380: missing CT.gov intervention type for CDR37779
failure processing CDR68093: missing CT.gov intervention type for CDR37779
failure processing CDR68106: missing CT.gov intervention type for CDR37779
The one that was missing from the first set was caused by a database query timeout error, which is probably extremely rare when the job is running during off hours without CIAT users working.
BZDATETIME::2009-01-13 14:59:01
BZCOMMENTOR::Bob Kline
BZCOMMENT::36
(In reply to comment #34)
> Noticed this in my review - it seems odd
>
> ...
>
> I checked the CDR term record on BACH for chromosomal translocation analysis
> and it is correctly mapped to CTGOVInterventionType of "Genetic" and yet here
> it is showing up with Intervention_type of Procedure/Surgery.
>
> Could you check. Same with cytogenetic analysis
Perhaps I have misunderstood what you wanted for the mapping logic. I came away from the meeting we had in my office a few days ago with the idea that we were supposed to use the CTGovInterventionType from the document linked by the protocol's InterventionNameLink element only if the semantic types of that document included "Drug/agent"; otherwise we were supposed to use the CTGovInterventionType from the document linked by the InterventionType element in the protocol document. I think this understanding matches the logic I wrote up in comment #24. Let me know if that's not right.
BZDATETIME::2009-01-14 09:57:04
BZCOMMENTOR::Bob Kline
BZCOMMENT::37
Lakshmi tried to post a reply to my last comment, but Internet Explorer couldn't find Verdi. We came up with a new version of the mapping logic, which I am posting here:
for each Intervention element in the trial document:
    find the InterventionType child of the Intervention element
    find the document (IT) linked by that element
    for each InterventionNameLink child:
        find the document (INL) linked by that child element
        get the preferred name (NAME) from that document
        get the semantic types for the INL document
        if any of these semantic types is 'Drug/agent combination':
            do nothing
        otherwise:
            find the CTGovInterventionType value from the INL document
            use it as intervention_type, with NAME as intervention_name
    if the Intervention element has no InterventionNameLink children:
        find the CTGovInterventionType value in the IT document
        use it as intervention_type
        use preferred name of IT document as intervention_name
In addition, we decided to retain the extra code which looks at trials whose documents have multiple intervention blocks with the same intervention name but different intervention types, but instead of trying to pick one of the blocks using the hard-coded hierarchy of preferred types the software will always fail the export of such trial documents.
Lakshmi:
Please let me know if I've captured this accurately.
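Here is a rough Python rendering of this version, including the check that fails a trial when one intervention name ends up with more than one type. All names and the data layout are hypothetical; the error text is modeled on the log message quoted earlier in this issue.

```python
from collections import defaultdict


class ExportFailure(Exception):
    """Signals that export of one trial document should fail."""


def map_intervention(it_doc, inl_docs):
    """Return (intervention_type, intervention_name) pairs for one
    Intervention element."""
    if not inl_docs:
        return [(it_doc["ctgov_type"], it_doc["name"])]
    return [
        (doc["ctgov_type"], doc["name"])
        for doc in inl_docs
        if "Drug/agent combination" not in doc["semantic_types"]
    ]


def check_one_type_per_name(pairs):
    """Raise ExportFailure if any intervention name has several types."""
    types_by_name = defaultdict(set)
    for itype, name in pairs:
        types_by_name[name].add(itype)
    for name, types in sorted(types_by_name.items()):
        if len(types) > 1:
            listed = "; ".join(f"'{t}'" for t in sorted(types))
            raise ExportFailure(f"intervention '{name}' has multiple types {listed}")


pairs = [
    ("Genetic", "mutation carrier screening"),
    ("Procedure/Surgery", "mutation carrier screening"),
]
try:
    check_one_type_per_name(pairs)
except ExportFailure as failure:
    print(failure)
```

With the precedence table gone, the check no longer tries to rescue such documents; it only reports them so the underlying term or protocol data can be corrected.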
BZDATETIME::2009-01-14 16:03:25
BZCOMMENTOR::Lakshmi Grama
BZCOMMENT::38
The logic is OK
BZDATETIME::2009-01-14 16:04:50
BZCOMMENTOR::Bob Kline
BZCOMMENT::39
Thanks. Do you want me to hold off on running another test set until CIAT has addressed the failures from the global change?
BZDATETIME::2009-01-14 16:21:14
BZCOMMENTOR::William Osei-Poku
BZCOMMENT::40
(In reply to comment #39)
> Thanks. Do you want me to hold off on running another test set until CIAT has
> addressed the failures from the global change?
>
Last comments in Issue 4414 should be here. I just copied the comments to this issue:
CIAT has looked at the errors, but we thought that if there is an intervention name, its CTGovInterventionType supersedes that of the intervention type it is paired with?
BZDATETIME::2009-01-15 09:36:20
BZCOMMENTOR::Bob Kline
BZCOMMENT::41
(In reply to comment #40)
> (In reply to comment #39)
> > Thanks. Do you want me to hold off on running another test set until CIAT has
> > addressed the failures from the global change?
> >
> Last comments in Issue 4414 should be here. I just copied the comments to this
> issue:
>
> CIAT has looked at the errors but we thought that if there is an intervention
> name, CTGovInterventionType supersedes that of the intervention type it is
> paired with?
>
Actually, Issue #4414 is the right home for this thread. See Lakshmi's latest comment (#19 in that issue).
I'm holding off on further testing of the export mapping until the loose ends in issue #4414 are resolved.
BZDATETIME::2009-01-16 13:16:06
BZCOMMENTOR::Bob Kline
BZCOMMENT::42
I ran a new set with the latest logic and compared the results to the first set (set A): http://bach.nci.nih.gov/issue4367-sets-a-and-d.html. However, you may decide that enough changes to the documents have been made since Tuesday that the noise makes this comparison report difficult to use, so I'm running a new base set with the production code, and will post the comparison against that set when it's done. The only way to get a pure test result is to stop CIAT from editing documents while I'm running the two sets, or clone the database to Franck and run the tests there. If you think that's necessary, let me know. I know you're anxious to get this done, but I don't know for sure how you view the tradeoff of that urgency with the need to get a clean test comparison.
BZDATETIME::2009-01-16 14:07:27
BZCOMMENTOR::Lakshmi Grama
BZCOMMENT::43
It seems like we needed to refresh Franck for some other reason - maybe we should just go ahead and do that and run the reports there.
BZDATETIME::2009-01-16 14:30:42
BZCOMMENTOR::Lakshmi Grama
BZCOMMENT::44
If I wanted to look at the CTGOV export XML file for a document, where do I have to go?
BZDATETIME::2009-01-16 14:58:40
BZCOMMENTOR::Bob Kline
BZCOMMENT::45
Volker:
Could you go ahead and refresh Franck so I can run these tests there? Let me know when it's ready.
BZDATETIME::2009-01-16 15:02:22
BZCOMMENTOR::Bob Kline
BZCOMMENT::46
(In reply to comment #44)
> If I wanted to look at the CTGOV export XML file for a document, where do I
> have to go?
http://bach.nci.nih.gov/cgi-bin/cdr/ViewCTGovExports.py?job=test-set-a
http://bach.nci.nih.gov/cgi-bin/cdr/ViewCTGovExports.py?job=test-set-b
http://bach.nci.nih.gov/cgi-bin/cdr/ViewCTGovExports.py?job=test-set-c
http://bach.nci.nih.gov/cgi-bin/cdr/ViewCTGovExports.py?job=test-set-d
BZDATETIME::2009-01-16 16:16:48
BZCOMMENTOR::Volker Englisch
BZCOMMENT::47
(In reply to comment #45)
> Let me know when it's ready.
It's ready now.
BZDATETIME::2009-01-16 17:59:37
BZCOMMENTOR::Bob Kline
BZCOMMENT::48
(In reply to comment #43)
> It seems like we needed to refresh Franck for some other reason - maybe we
> should just go ahead and do that and run the reports there.
>
Here you go:
http://franck.nci.nih.gov/issue4367-sets-e-and-f.html
Here are the documents:
http://bach.nci.nih.gov/cgi-bin/cdr/ViewCTGovExports.py?job=test-set-e
http://bach.nci.nih.gov/cgi-bin/cdr/ViewCTGovExports.py?job=test-set-f
BZDATETIME::2009-01-21 12:21:17
BZCOMMENTOR::William Osei-Poku
BZCOMMENT::49
(In reply to comment #48)
> (In reply to comment #43)
> > It seems like we needed to refresh Franck for some other reason - maybe we
> > should just go ahead and do that and run the reports there.
> >
>
> Here you go:
>
> http://franck.nci.nih.gov/issue4367-sets-e-and-f.html
>
> Here are the documents:
>
> http://bach.nci.nih.gov/cgi-bin/cdr/ViewCTGovExports.py?job=test-set-e
> http://bach.nci.nih.gov/cgi-bin/cdr/ViewCTGovExports.py?job=test-set-f
>
Bob,
Are you expecting comments from CIAT at this point?
BZDATETIME::2009-01-21 13:53:55
BZCOMMENTOR::Bob Kline
BZCOMMENT::50
(In reply to comment #49)
> Are you expecting comments from CIAT at this point?
See comment #29 for an outline of the test plan. You need to review the report whose link I posted in comment #48 (sets E and F are the Franck equivalents of sets A and B in the Bach tests; we switched to Franck so the diff report wouldn't have any noise created by concurrent editing of the documents between the two test runs), as well as sample documents from set F.
BZDATETIME::2009-01-21 16:57:42
BZCOMMENTOR::William Osei-Poku
BZCOMMENT::51
(In reply to comment #50)
> (In reply to comment #49)
>
> > Are you expecting comments from CIAT at this point?
>
> See comment #29 for an outline of the test plan. You need to review the report
> whose link I posted in comment #48 (sets E and F are the Franck equivalents of
> sets A and B in the Bach tests; we switched to Franck so the diff report
> wouldn't have any noise created by concurrent editing of the documents between
> the two test runs), as well as sample documents from set F.
>
We compared the two reports and they look fine. We did not see anything that was not expected.
BZDATETIME::2009-01-22 09:32:32
BZCOMMENTOR::Bob Kline
BZCOMMENT::52
As soon as you've had a chance to review the test results and I get the green light from you, Lakshmi, I'll put this into production.
BZDATETIME::2009-01-22 11:55:26
BZCOMMENTOR::Lakshmi Grama
BZCOMMENT::53
Were there any error messages in the logs?
BZDATETIME::2009-01-22 14:18:21
BZCOMMENTOR::Bob Kline
BZCOMMENT::54
(In reply to comment #53)
> Were there any error messages in the logs?
!1120 Fri Jan 16 17:10:36 2009: failure processing CDR269315: no match for study category 'BIOMARKER/LABORATORY ANALYSIS' found
!1120 Fri Jan 16 17:14:09 2009: failure processing CDR378088: no match for study category 'BIOMARKER/LABORATORY ANALYSIS' found
!1120 Fri Jan 16 17:19:20 2009: failure processing CDR485360: no match for study category 'BIOMARKER/LABORATORY ANALYSIS' found
!1120 Fri Jan 16 17:22:04 2009: failure processing CDR547101: no match for study category 'BIOMARKER/LABORATORY ANALYSIS' found
!1120 Fri Jan 16 17:22:25 2009: failure processing CDR554708: missing intervention description
!1120 Fri Jan 16 17:23:36 2009: failure processing CDR574195: no match for study category 'BIOMARKER/LABORATORY ANALYSIS' found
!1120 Fri Jan 16 17:24:06 2009: failure processing CDR581165: no match for study category 'TISSUE COLLECTION/REPOSITORY' found
!1120 Fri Jan 16 17:26:17 2009: failure processing CDR613100: no match for study category 'TISSUE COLLECTION/REPOSITORY' found
!1120 Fri Jan 16 17:26:34 2009: failure processing CDR617990: missing intervention description
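For review purposes, failures like the ones above can be tallied by reason with a short script. This is only a minimal sketch, not part of the CDR export software: the function name `group_failures` is hypothetical, and the entry layout (a leading `!`, a number, a timestamp, then `failure processing CDR<id>: <reason>`) is an assumption inferred from the excerpt. Wrapped continuation lines are rejoined before matching.

```python
import re
from collections import defaultdict

# Assumed entry layout, inferred from the log excerpt:
#   !<number> <timestamp>: failure processing CDR<id>: <reason>
ENTRY = re.compile(r"^!\d+ .*?: failure processing (CDR\d+): (.+)$")


def group_failures(log_text):
    """Return {reason: [CDR IDs]} for the failure entries in log_text."""
    entries = []
    for line in log_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("!") or not entries:
            entries.append(line)
        else:
            # Continuation of a wrapped entry: rejoin with a single space.
            entries[-1] += " " + line
    grouped = defaultdict(list)
    for entry in entries:
        match = ENTRY.match(entry)
        if match:
            doc_id, reason = match.groups()
            grouped[reason].append(doc_id)
    return dict(grouped)


# Three entries copied from the log excerpt, in their wrapped form.
sample = """\
!1120 Fri Jan 16 17:22:25 2009: failure processing CDR554708: missing
intervention description
!1120 Fri Jan 16 17:24:06 2009: failure processing CDR581165: no match
for study category 'TISSUE COLLECTION/REPOSITORY' found
!1120 Fri Jan 16 17:26:34 2009: failure processing CDR617990: missing
intervention description
"""

for reason, doc_ids in group_failures(sample).items():
    print(f"{len(doc_ids)} x {reason}: {', '.join(doc_ids)}")
```

Grouping by reason makes it easier to see that the failures fall into a small number of categories, which can then be handed to CIAT as separate cleanup lists.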
BZDATETIME::2009-01-26 09:45:40
BZCOMMENTOR::William Osei-Poku
BZCOMMENT::55
(In reply to comment #54)
> (In reply to comment #53)
>
> > Were there any error messages in the logs?
>
The errors have been fixed. According to Mary:
"I didn’t see any problems that would preclude CT.gov intervention
mapping with the following docs:
485360
547101
"
BZDATETIME::2009-01-26 13:23:36
BZCOMMENTOR::Lakshmi Grama
BZCOMMENT::56
According to Mary:
> "I didn’t see any problems that would preclude CT.gov intervention mapping
> with the following docs:
> 485360
> 547101
> "
Well, she needs to look at the issues a little more carefully. The error message indicates that 485360 cannot be processed for export because there is no match for the Biomarker/Lab Analysis value, since we do not export these studies to CTGOV. I would ask her to review the previous versions of this trial. At some point it went from being a Research study with a primary type of Biomarker/Lab Analysis to a Clinical Trial with a primary type of Biomarker/Lab Analysis.
Somehow the trial is already on CTGOV - as a treatment study. We may have made some code changes since the time the study was originally published and now the trial is really not being updated. Given that it is already registered and JHOC may not be happy if we pull this trial, we may have to adopt a slightly different data standard for this trial. Please ask Mary to call if she has questions.
She may want to look closely at the other trial as well.
BZDATETIME::2009-01-27 11:05:40
BZCOMMENTOR::Lakshmi Grama
BZCOMMENT::57
I talked to Bob and we will upload the export job from Franck to PRS Test. Volker will need to let us know about any error messages that show up in the PRS Admin screen as a result of this load.
BZDATETIME::2009-01-27 14:49:02
BZCOMMENTOR::Bob Kline
BZCOMMENT::58
I've been looking at the results of the PRS test and have discovered something fishy in the output the new software generated. I'm seeing trials with multiple intervention blocks, some with arm_group_label children and some without. Not supposed to happen. I am digging in to find out why it did.
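A check for the symptom described above (a trial whose intervention blocks are a mix of ones with and ones without arm_group_label children) might look like the following. This is only a sketch, not the CDR export code: the element names `intervention` and `arm_group_label` come from the comment above, while the `clinical_study` root element, the sample documents, and the function name are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def has_mixed_intervention_blocks(xml_text):
    """True if some, but not all, intervention blocks have an arm_group_label child."""
    root = ET.fromstring(xml_text)
    blocks = list(root.iter("intervention"))
    labeled = [b for b in blocks if b.find("arm_group_label") is not None]
    return 0 < len(labeled) < len(blocks)


# Hypothetical documents: the root element name is an assumption.
mixed = """<clinical_study>
  <intervention><arm_group_label>Arm I</arm_group_label></intervention>
  <intervention/>
</clinical_study>"""

consistent = """<clinical_study>
  <intervention><arm_group_label>Arm I</arm_group_label></intervention>
  <intervention><arm_group_label>Arm II</arm_group_label></intervention>
</clinical_study>"""

print(has_mixed_intervention_blocks(mixed))
print(has_mixed_intervention_blocks(consistent))
```

Running a check of this kind over every document in a test set would flag the affected trials without having to read each export by hand.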
BZDATETIME::2009-01-27 15:58:36
BZCOMMENTOR::Bob Kline
BZCOMMENT::59
Found the bug. Will fix and do another test run.
BZDATETIME::2009-01-27 18:27:46
BZCOMMENTOR::Bob Kline
BZCOMMENT::60
(In reply to comment #59)
> Found the bug. Will fix and do another test run.
>
http://bach.nci.nih.gov/cgi-bin/cdr/ViewCTGovExports.py?job=test-set-g
Volker:
Please upload this to PRS Test.
BZDATETIME::2009-01-28 10:11:06
BZCOMMENTOR::Bob Kline
BZCOMMENT::61
Here's the diff file for the latest run with the fixed code:
BZDATETIME::2009-01-30 09:41:28
BZCOMMENTOR::Bob Kline
BZCOMMENT::62
(In reply to comment #60)
> Please upload this to PRS Test.
The test came out clean.
BZDATETIME::2009-02-03 09:37:19
BZCOMMENTOR::Lakshmi Grama
BZCOMMENT::63
Go ahead and move changes to production so we can publish with the new code tonight.
BZDATETIME::2009-02-03 09:47:19
BZCOMMENTOR::Bob Kline
BZCOMMENT::64
Promoted to Bach.
BZDATETIME::2009-02-04 11:02:09
BZCOMMENTOR::William Osei-Poku
BZCOMMENT::65
(In reply to comment #56)
> Somehow the trial is already on CTGOV - as a treatment study. We may have
> made some code changes since the time the study was originally published and
> now the trial is really not being updated. Given that it is already registered
> and JHOC may not be happy if we pull this trial, we may have to adopt a
> slightly different data standard for this trial. Please ask Mary to call if she
> has questions.
>
> She may want to look closely at the other trial as well.
>
Lakshmi:
Mary mentioned to me yesterday that you had decided to map the research
trials that fail to be exported to intervention type "Other" so that
they could be exported successfully. Should I open a separate issue
for this case, or will it be taken care of under this issue?
BZDATETIME::2009-02-04 15:39:01
BZCOMMENTOR::Lakshmi Grama
BZCOMMENT::66
I think the suggestion from Mary was to map all Laboratory Analysis/Biomarker studies to have study_type of Interventional and interventional_subtype of Other. There are some other issues with regard to mapping these as above:
1. Make sure that these studies have the Number of Arms value mapped correctly - they should all be single arm.
2. They should have Outcome measures, PrimaryCompletion Dates, and FDA Regulated information blocks - at least the ones that were active as of the Dec 26, 2007 cutoff date.
3. Make sure these trials will all have Interventions.
Here is the requirement for Intervention studies from CTGOV DTD:
Primary and secondary outcomes are required for interventional studies, optional for observational studies.
Primary completion date and type are required for interventional studies, optional for observational studies.
For interventional studies, if number_of_arms > 1, the corresponding number of arm_group tags must be included.
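The three DTD requirements quoted above can be expressed as a simple pre-export check. The following is a sketch under stated assumptions, not the actual export logic: a study is represented as a plain dict, `study_type` and `number_of_arms` follow names used in this thread, and every other key (`primary_outcomes`, `secondary_outcomes`, `primary_completion_date`, `primary_completion_date_type`, `arm_groups`) is a hypothetical name chosen for illustration.

```python
def check_interventional(study):
    """Return a list of problems for one study, given as a plain dict."""
    if study.get("study_type") != "Interventional":
        return []  # the quoted rules are optional for observational studies
    problems = []
    # Hypothetical keys standing in for the required outcome and date fields.
    required = (
        "primary_outcomes",
        "secondary_outcomes",
        "primary_completion_date",
        "primary_completion_date_type",
    )
    for key in required:
        if not study.get(key):
            problems.append(f"missing {key}")
    arms = study.get("number_of_arms", 0)
    groups = len(study.get("arm_groups", []))
    if arms > 1 and groups != arms:
        problems.append(
            f"number_of_arms is {arms} but {groups} arm_group entries found"
        )
    return problems


# Hypothetical study record used only to exercise the check.
example = {
    "study_type": "Interventional",
    "primary_outcomes": ["Response rate"],
    "secondary_outcomes": [],
    "primary_completion_date": "2009-06",
    "primary_completion_date_type": "Anticipated",
    "number_of_arms": 2,
    "arm_groups": ["Arm I"],
}

for problem in check_interventional(example):
    print(problem)
```

A single-arm study, as proposed for the Laboratory Analysis/Biomarker trials, passes the arm check regardless of how many arm_group entries it carries, because the rule only applies when number_of_arms is greater than 1.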
BZDATETIME::2009-02-05 08:24:17
BZCOMMENTOR::Bob Kline
BZCOMMENT::67
Lakshmi:
I wasn't sure how much of the previous comment was directed to me (as "go ahead and make the software take care of this") and how much to CIAT (for them to do data cleanup and QA), and whether you intended William to create a new issue for handling these trials, or to piggy-back the additional work on this issue.
BZDATETIME::2009-02-17 20:21:51
BZCOMMENTOR::William Osei-Poku
BZCOMMENT::68
Closing this issue per CDR Meeting of 02/12/2009.
A new issue #4487 has been created to address the mapping of Biomarker/Lab Analysis studies.
File Name | Posted | User |
---|---|---|
failures.log | 2008-12-02 15:58:15 | |
InterventionNameSemanticTypes.xls | 2008-12-11 16:04:25 | |
InterventionNameSemanticTypes.xls | 2008-12-09 11:04:01 | |
multiple-types.log | 2008-12-02 14:41:32 | |