Issue Number | 3646 |
---|---|
Summary | New status for abandoned trials |
Created | 2013-08-19 11:59:53 |
Issue Type | New Feature |
Submitted By | Beckwith, Margaret (NIH/NCI) [E] |
Assigned To | alan |
Status | Closed |
Resolved | 2013-09-25 18:39:35 |
Resolution | Fixed |
Path | /home/bkline/backups/jira/ocecdr/issue.112109 |
We need to create a new status for Abandoned Trials that will move them out of the Active category into the Closed category on Cancer.gov. These are trials that still have a status of Active but have not been updated for years. We can't really officially change the status to Closed, so Lakshmi and I talked about having a status similar to CT.gov of "Status Unknown" (or something along those lines) that would sort with Closed/Completed trials in the search form. All of these trials have been identified, and they have a note in the title that says they are no longer being updated. It looks really bad to have them in the Active set. There are around 300 trials in this category.
I assume the search form referred to above is on the cancer.gov site, right? If so, are they able to control the sort order of the status values on the form manually? Or do we need to come up with an artificial status name contrived to sort in the desired position (e.g., "Couldn't Determine Status")?
Yes, I was referring to the search form on Cancer.gov. In terms of searching, there are only two buckets: Active and Closed. In the backend, trials that have the statuses of Active and Approved-Not-Yet-Active show up in the Active bucket. Trials that have the status of Closed or Completed show up in the Closed bucket. We would like to add another trial status value of "Status Unknown" and have trials with that value go into the Closed bucket. The status value itself does show in the description of the trial on Cancer.gov. Does this make sense? I would like to add Lakshmi to this issue, but can't seem to figure out how to do that. I want her to weigh in (and possibly clarify if I have muddled it 🙂).
I added Lakshmi (by clicking on the number next to "Watching" above).
Thank you!
We're talking about the InScope trials, not CT.gov, right? So we'll add "Unknown" as a new value to the enumerated ProtocolStatusName type. We'll need to have Volker verify that the publishing filters will just pass the new value through (that is, the filter logic isn't dependent on hard-wired values). If I'm reading what you wrote correctly, it sounds like cancer.gov will be given the new value, will display it, and will modify their search software to fold the new value into the "Closed" search bucket on their end, right? Do we need to modify the custom rule which checks for a Yes or No value in the FDARegulated element so that it accepts a document without the element if the overall status is "Unknown"? We'll need to check the code in the CDR Server which determines what the overall status value is based on the individual lead org statuses to see if that code needs modification.
Lakshmi: Margaret was hoping you'd throw your oar in on this issue and confirm whether we're heading in the right direction.
You are heading in the right direction. Could we ask Blair to tell us what it would take to make the change at the Gatekeeper end to add trials with the Unknown status value into the "Closed" bucket on Cancer.gov? That way the change in status can be coordinated.
The CDR server function that determines overall protocol status does check lead org status and I think you cannot mark a protocol as closed if at least one lead org is active. Do you think it might help just to globally change the status of the lead orgs in this case to closed? That is not really published anywhere and if it helps reduce the pieces of code that need to be modified, the better it is.
Anyway, it's been so long since I have thought about protocol status setting, I'm afraid I don't remember the details very well.
Users can't set the overall status themselves. That's done in the server at document save time, based on the lead org statuses using some precedence logic. So the user would mark the statuses of the lead orgs as "Unknown" and that will bubble up to the overall status element when the document is saved. It might be possible to let the users only set one of the lead orgs' statuses to Unknown, and have the logic in the server set the overall status to Unknown if any of the lead org statuses is Unknown, but I think that would violate the Rule of Least Surprise. :-)
Blair:
I've added you as a watcher to this issue so you can give us an idea of what would be involved in modifying the Gatekeeper logic to map the new "Unknown" protocol status value to "Closed" for the clinical trials search interface (see Lakshmi's earlier comment). Can you confirm that the trial display would show the correct status value ("Unknown")? What's the likelihood that this could get into the next WCMS release? If that's not possible, would the change on our end be benign, or would we have to defer introducing the new status value until the changes to Gatekeeper were in production?
I won’t have time to take a detailed look for a few days, but part of the answer depends on the desired outcome.
On the Clinical Trials search page, the decision between “Active (currently accepting patients)” versus “Closed (not accepting patients)” selects between a True/False value designating whether a trial “Is Active.” If it’s OK for that semantic to become “Active trials versus ‘everything else’”, then that part at least should be relatively straightforward. Adding a third state would definitely complicate matters.
The logic for deciding whether the Is Active flag goes to True or False should reside in GateKeeper. I’d have to do some research to see how GateKeeper would react to a new value.
We're not looking for a third state. It is OK for the semantics to lump trials with the value "Unknown" into the "everything else" bucket.
We actually also include trials with the status "Approved Not Yet Active" in the Active bucket on Cancer.gov. But everything else goes into the Closed bucket, and it is fine to include the trials with the "Unknown" status in that bucket.
Going on the belief that the element we’re talking about is CurrentProtocolStatus.
The GateKeeper logic is:
    If the text of the CurrentProtocolStatus element is either “ACTIVE”
    or “APPROVED-NOT YET ACTIVE” Then
        IsActive = True.
    Else
        IsActive = False.
    End If
There’s a little more to handle case-sensitivity, but that’s the part we’re most interested in.
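The rule quoted above can be sketched in Python roughly as follows. This is only an illustration of the described behavior, not GateKeeper's actual code: the function name `is_active` is hypothetical, and the case handling (trim, then upper-case) is an assumption standing in for the "little more" mentioned above.

```python
# Illustrative sketch only; GateKeeper's real implementation is not shown in this issue.
ACTIVE_STATUSES = {"ACTIVE", "APPROVED-NOT YET ACTIVE"}


def is_active(current_protocol_status: str) -> bool:
    """Return True when the status text belongs in the "Active" search bucket."""
    # Assumption: case-insensitivity is handled by trimming and upper-casing.
    return current_protocol_status.strip().upper() in ACTIVE_STATUSES


# Any value outside the two active statuses, including the new "Unknown",
# falls through to the "everything else" (Closed) bucket.
for status in ("Active", "Approved-not yet active", "Closed", "Completed", "Unknown"):
    print(f"{status}: {is_active(status)}")
```

Under this reading, no search-side change is needed for "Unknown": it is simply not in the active set.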
Excellent! That takes care of the search logic. If we can also verify that the display will show whatever value comes its way, we'll be good to go.
This is really promising - thanks for checking, Blair.
This requires testing, but digging through the code, it appears that GateKeeper saves whatever text it’s given for the status and the display logic just shows it.
The schema change has been installed on DEV:
Schemas/InScopeProtocol.xml (r12015)
I have determined that no changes need to be made to the CDR server code, as we'll only use the new status value for a trial if no lead orgs have another status value (see my earlier comments on this issue), and the existing code will handle that already. There's still an outstanding question about the custom FDARegulated validation rule (see above).
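To illustrate why no server change would be needed in that case, here is a minimal sketch of one way such a "bubble up" rule could behave. It is not the CDR server's code: the function name, the rule that a uniform status is passed through unchanged, and the precedence order are all assumptions made for the example.

```python
# Hypothetical precedence order, for illustration only; the real CDR server
# logic is not reproduced in this issue.
ASSUMED_PRECEDENCE = [
    "Active",
    "Approved-not yet active",
    "Temporarily closed",
    "Closed",
    "Completed",
]


def overall_status(lead_org_statuses, precedence=ASSUMED_PRECEDENCE):
    """Derive an overall protocol status from the lead org statuses."""
    distinct = set(lead_org_statuses)
    if not distinct:
        raise ValueError("protocol has no lead org statuses")
    # Assumption: when every lead org agrees, that status is used as is,
    # so a value the precedence list has never heard of still bubbles up.
    if len(distinct) == 1:
        return distinct.pop()
    # Otherwise the highest-precedence status present wins.
    for status in precedence:
        if status in distinct:
            return status
    raise ValueError(f"no known status among {sorted(distinct)}")


print(overall_status(["Unknown", "Unknown"]))
print(overall_status(["Active", "Closed"]))
```

With a rule of this shape, setting every lead org to "Unknown" yields an overall status of "Unknown" without adding the new value to the precedence logic.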
I've passed the issue on to Volker, so he can verify that the publishing filters will pass along the new value.
The CurrentProtocolStatus field is copied as is. We do have a few templates that are only performing certain tasks if the status is 'Active' and/or 'Approved' (i.e. display sites) but given the fact that the new value should be used as a flavor of 'Closed' this will work fine without a filter change.
We decided in the status meeting to create a global change to change the lead org statuses for the roughly 300 trials for which William will provide the list of document IDs.
Here is my understanding of what the global change has to do. Please
let me know about any incorrect statements below:
Read a list of CDR IDs for documents that must be changed from an
external file.
    The format of the file is to be determined. It might be a
    spreadsheet, a simple text file, or conceivably even an HTML
    page to be parsed to find the IDs.
    CIAT will provide us with the list after we agree on the
    format.
For each CDR ID:
    Fetch the document from the database.
    If it's not an InScopeProtocol:
        Produce an error message and do not modify the document.
    Else verify that no instances of the following element exist
    (see Bob's JIRA comment, the third one above this one):
        /InScopeProtocol/ProtocolAdminInfo/ProtocolLeadOrg/LeadOrgProtocolStatuses/CurrentOrgStatus/StatusName
    If even one such element exists:
        Produce an error message and do not modify the document.
    Else assign the value "Unknown" to the following element,
    replacing whatever is its current value:
        /InScopeProtocol/ProtocolAdminInfo/CurrentProtocolStatus
    and save the document.
Have I missed anything or got anything wrong?
Okay, I think I am a little confused. The trials that we are trying to identify have an overall status of Active (hence they are showing up in the Active bucket on Cancer.gov). Doesn't this mean that they have at least one lead org status of Active or Approved not yet Active? We need to change the lead org statuses to Unknown. I am probably muddling this up.
My other comment is that we had briefly talked about the possibility of identifying the set of trials to change by looking for trials that have the "Abandoned" value element and an overall status of Active. Is that possible? William--weigh in here if I missed something (and maybe you could correctly identify the element I am referring to).
Yes, the logic should really be:
For each CDR ID:
    Fetch the document from the database.
    If it's not an InScopeProtocol:
        Produce an error message and do not modify the document.
    Else:
        For each lead org in the document:
            Change the CurrentOrgStatus element's tag to PreviousOrgStatus
            Insert as the first child of the LeadOrgProtocolStatus block:
                <CurrentOrgStatus>
                 <StatusName>Unknown</StatusName>
                 <StatusDate>THE CURRENT DATE</StatusDate>
                 <Comment>Added by global change for JIRA task OCECDR-3646</Comment>
                 <EnteredBy>CDR LOGIN FOR USER RUNNING THE JOB</EnteredBy>
                 <EntryDate>THE CURRENT DATE</EntryDate>
                </CurrentOrgStatus>
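The per-document transformation described above could be sketched with Python's standard `xml.etree.ElementTree` along these lines. This is not the global change script that was actually run: the function name, the sample document, and the user name are hypothetical, and the sketch assumes the status block is the `LeadOrgProtocolStatuses` element named in the path quoted earlier in this issue.

```python
import datetime
import xml.etree.ElementTree as ET

# Assumption: element path taken from the earlier comment in this issue.
STATUS_BLOCKS = "ProtocolAdminInfo/ProtocolLeadOrg/LeadOrgProtocolStatuses"
COMMENT = "Added by global change for JIRA task OCECDR-3646"


def mark_lead_orgs_unknown(doc_xml, user, today=None):
    """Return the document XML with every lead org's current status set to Unknown."""
    today = today or datetime.date.today().isoformat()
    root = ET.fromstring(doc_xml)
    if root.tag != "InScopeProtocol":
        raise ValueError(f"not an InScopeProtocol document: {root.tag}")
    for block in root.findall(STATUS_BLOCKS):
        # Demote the existing current status to a previous status.
        for old in block.findall("CurrentOrgStatus"):
            old.tag = "PreviousOrgStatus"
        # Build the new current status and insert it as the first child.
        new = ET.Element("CurrentOrgStatus")
        for tag, text in (
            ("StatusName", "Unknown"),
            ("StatusDate", today),
            ("Comment", COMMENT),
            ("EnteredBy", user),
            ("EntryDate", today),
        ):
            ET.SubElement(new, tag).text = text
        block.insert(0, new)
    return ET.tostring(root, encoding="unicode")


# Made-up minimal document, for illustration only.
SAMPLE = (
    "<InScopeProtocol><ProtocolAdminInfo><ProtocolLeadOrg>"
    "<LeadOrgProtocolStatuses><CurrentOrgStatus>"
    "<StatusName>Active</StatusName><StatusDate>2005-01-01</StatusDate>"
    "</CurrentOrgStatus></LeadOrgProtocolStatuses>"
    "</ProtocolLeadOrg></ProtocolAdminInfo></InScopeProtocol>"
)
print(mark_lead_orgs_unknown(SAMPLE, "hypothetical_user", "2013-09-17"))
```

The sketch deliberately leaves CurrentProtocolStatus alone, since the overall status is recalculated by the server when the document is saved.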
As for having the software identify the documents to be modified (if that's what you meant by "identifying"): if the "value element" we're looking for is something embedded in a free-text comment, that would be risky for the same reasons that approach is always risky. It might be possible for Alan to write a report which parses all the candidate documents looking for that word in a comment, and CIAT manually examines the report to come up with a definitive list of the documents which should actually be modified. I'll let Margaret, Alan, and William negotiate what they think is the most efficient use of Alan's and CIAT's time.
The element I was referring to was one we added a few months ago: "Abandoned" is a new value in the list of values of the CTGovOwnershipTransferContactResponse element of the CTGovOwnershipTransferContactLog block in the InScopeProtocol Schema.
Ah, yes. I see that now. That should be usable (though that element will need to be indexed first). Are the semantic differences between "Abandoned" and "No response" commonly understood for this context?
I think so. Abandoned was specifically added for the set of trials that no one had touched in a long time and will need to go into the abandoned account at CT.gov. I am comfortable with using that for identifying this set of trials. But if this is a lot of extra work, it is VERY easy to do a search on Cancer.gov and find this set. Pulling out the CDR IDs from that search could be time consuming though. William--any thoughts?
Indexing the element is not a lot of work.
In fact, it's already done.
Reading over the comments I see I took something out of context and came up with the wrong idea about what needs to be done. But I think I've got the algorithm for the document changes now based on Bob's correction of my text.
Can someone post specific instructions for the search on cancer.gov that selects all of these documents?
Added my name to the watching list.
Thank you William!
Can someone (William?) post specific instructions for the search on cancer.gov that selects all of these documents?
Thanks.
Alan, we were hoping that the set of trials could be identified by using the Abandoned value (that is now being indexed), see comments from last Friday above. It would be the set of trials that have an overall status of Active/Recruiting but not Yet Active (they go into the Active bucket on Cancer.gov) and have that Abandoned value. One problem that Ning brought up today is that apparently the Keyword search that we used before to identify this set of trials on Cancer.gov is not working now.
Alan, I picked up 292 active trials (in PROD) with this query
SELECT q.doc_id, q.value, s.path, s.value
FROM query_term_pub q
JOIN pub_proc_cg c
ON c.id = q.doc_id
JOIN query_term_pub s
ON q.doc_id = s.doc_id
WHERE q.value = 'Abandoned'
AND s.path = '/InScopeProtocol/ProtocolAdminInfo/CurrentProtocolStatus'
AND s.value not in ('Completed', 'Closed', 'Temporarily closed')
ORDER BY s.doc_id
(Note: Testing to enter text in JIRA as monospaced text)
Alan,
I have attached a file from a query of the Production server for all
trials that have been marked as abandoned. It is a simple query:
SELECT doc_id
FROM query_term
WHERE Path =
'/InScopeProtocol/CTGovOwnershipTransferContactLog/CTGovOwnershipTransferContactResponse'
AND value = 'Abandoned'
This is the complete set of abandoned trials, but you need to exclude all trials with a status other than Active and Approved not-yet active.
I think what you've got is the right set. I will review some of the trials to confirm.
I did look at a few of the trials and they were all correctly selected. Would it be possible to refresh DEV for any possible tests since the trials marked 'Abandoned' are not on DEV yet? I haven't checked QA but they may be on QA.
The trials are on QA so I guess we can test on QA if necessary.
Volker,
Since we've been talking about possibly stopping publishing of InScope
trials in the near future, can you query PROD to find out if there are
any InScope trials besides these (Abandoned) that are either Active or
Approved Not-yet active?
Unfortunately, refreshing DEV is much less straightforward than refreshing QA. There are filters, schemas, and other things that are newer on DEV than PROD that need to be preserved during the refresh. We had a script to do this on our old servers but we've not tried it yet on the CBIIT servers and it probably needs revisions. So we have to coordinate that with CBIIT and do a careful test. Like everything else now, it could turn out to be a big effort.
We'll have to do it eventually, but probably not in time for this test.
However, QA should work okay for testing this, especially because we'll be able to run the global change in test mode until we think it's exactly right without mangling anything in the database or altering test conditions.
> find out if there are any InScope trials besides these (Abandoned) that are either Active or Approved Not-yet active?
I see 462 such trials, 88 of those have been published to Cancer.gov. Here are the first 20 of the 462.
doc_id value
------ ------
63391 Active
64228 Active
64802 Active
65530 Active
66312 Active
66485 Active
66489 Active
66490 Active
66504 Active
66507 Active
66510 Active
66512 Active
66513 Active
66514 Active
66525 Active
66543 Active
66585 Active
66885 Active
67084 Active
67218 Active
Thanks! Generally, we are only interested in the 88 trials that have been published to Cancer.gov. I will review them before our meeting on Thursday. We probably should discuss at the meeting whether to treat these as abandoned trials as well. Could you post your query or email it to me so that I can get a complete list of the trials?
I wrote a query similar to Volker's that produced the same count of 292 document IDs, as follows:
64165 64500 65880 66847 67357 67465 67536 68354
68388 68476 68911 68969 69293 69457 256532 271424
341437 343699 347463 358797 361751 361760 365544 378144
386240 409723 413706 425383 427312 429610 441158 449719
450162 453316 454543 454570 454596 454721 455040 455087
455125 455569 455572 455583 455588 455738 456203 456480
456773 460074 463518 466676 467994 468031 471769 472206
472976 481365 482277 485428 487602 491440 491451 492266
495321 495777 508635 509044 510284 513051 515900 516004
516823 517194 517312 523378 526121 526239 526299 526368
528021 528289 529353 530026 531136 531140 532934 532941
532943 533828 537042 538115 538879 539352 539539 540180
540233 548777 551555 551556 551557 551559 553120 553251
554297 557417 560114 560121 560128 561066 561076 561079
561733 564820 566209 570041 571546 571634 573199 573340
574037 574344 574367 574585 576425 576439 577728 581139
581143 581165 581176 582315 584254 584270 584278 584446
586420 586791 587470 587495 587504 587517 587523 587746
587987 588423 588427 588868 589004 589199 589227 589230
589308 590089 590649 592728 593562 593564 593698 594671
596572 597000 597895 597903 598877 598878 598879 598880
598881 598882 599206 599372 599886 600332 601175 601214
601695 612567 612590 613601 614811 614912 615602 615902
617983 626194 629681 629824 631252 632144 632722 633348
634652 635953 636332 636371 636859 636974 637053 637622
637640 637812 638974 639017 639096 639513 639649 639659
640330 640379 640493 640500 641101 641288 641383 641937
642221 642751 643641 643743 644123 644893 647658 648274
649021 649054 649670 649750 649763 649812 649867 649890
650138 650654 650829 651250 652115 652306 652331 652936
653093 655148 655182 657523 658351 659192 659310 660317
660324 661071 661288 665188 666511 666842 667211 667364
667766 668525 668528 669246 669712 669716 669914 671002
671070 671670 671673 672171 674580 681693 682204 682206
683850 683852 683940 683942 684018 684020 686456 686459
686602 687338 688119 688122 689973 694647 695000 695270
695874 697324 697471 699222
Volker's query joins a table of all document IDs on cancer.gov, excluding any that are not on CG. If I eliminate that restriction, one more document is added: CDR0000595184. It's listed as an Active trial but it's blocked and has the comment:
"WITHDRAWN per cleanup for "List of NLM Studies with ArmsOrGroups" report - Keeping ct.gov CDR 749703/NCT00707083; 2013-06-12 jstringer/nyu."
The query I'm using that gets all 293 is:
SELECT q.doc_id
FROM query_term_pub q
-- Uncomment if we want to limit to docs currently on cancer.gov
-- JOIN pub_proc_cg c
-- ON c.id = q.doc_id
JOIN query_term_pub s
ON q.doc_id = s.doc_id
WHERE q.path =
'/InScopeProtocol/CTGovOwnershipTransferContactLog/CTGovOwnershipTransferContactResponse'
AND q.value = 'Abandoned'
AND s.path =
'/InScopeProtocol/ProtocolAdminInfo/CurrentProtocolStatus'
AND s.value IN ('Active', 'Approved-not yet active')
ORDER BY q.doc_id
Here's the query more readably formatted:
SELECT q.doc_id
FROM query_term_pub q
-- If we want to limit to docs currently on cancer.gov
-- JOIN pub_proc_cg c
-- ON c.id = q.doc_id
JOIN query_term_pub s
ON q.doc_id = s.doc_id
WHERE q.path = '/InScopeProtocol/CTGovOwnershipTransferContactLog/CTGovOwnershipTransferContactResponse'
AND q.value = 'Abandoned'
AND s.path = '/InScopeProtocol/ProtocolAdminInfo/CurrentProtocolStatus'
AND s.value IN ('Active', 'Approved-not yet active')
ORDER BY q.doc_id
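To show how the self-join in this query picks out documents, here is a small self-contained sketch that runs the same statement against an in-memory SQLite table. The table layout (doc_id, path, value) is inferred from the query, the four documents are made-up toy data, and SQLite is only a stand-in for the production database.

```python
import sqlite3

PATH_RESPONSE = (
    "/InScopeProtocol/CTGovOwnershipTransferContactLog"
    "/CTGovOwnershipTransferContactResponse"
)
PATH_STATUS = "/InScopeProtocol/ProtocolAdminInfo/CurrentProtocolStatus"

QUERY = """
SELECT q.doc_id
  FROM query_term_pub q
  JOIN query_term_pub s
    ON q.doc_id = s.doc_id
 WHERE q.path = ?
   AND q.value = 'Abandoned'
   AND s.path = ?
   AND s.value IN ('Active', 'Approved-not yet active')
 ORDER BY q.doc_id
"""

conn = sqlite3.connect(":memory:")
# Assumed minimal schema: one row per indexed element value.
conn.execute("CREATE TABLE query_term_pub (doc_id INTEGER, path TEXT, value TEXT)")
conn.executemany(
    "INSERT INTO query_term_pub VALUES (?, ?, ?)",
    [
        # Toy documents 1-4, not real CDR IDs.
        (1, PATH_RESPONSE, "Abandoned"), (1, PATH_STATUS, "Active"),
        (2, PATH_RESPONSE, "Abandoned"), (2, PATH_STATUS, "Closed"),
        (3, PATH_RESPONSE, "No response"), (3, PATH_STATUS, "Active"),
        (4, PATH_RESPONSE, "Abandoned"), (4, PATH_STATUS, "Approved-not yet active"),
    ],
)
selected = [row[0] for row in conn.execute(QUERY, (PATH_RESPONSE, PATH_STATUS))]
print(selected)  # [1, 4]
```

Only documents that carry both the Abandoned response and an active overall status are selected; document 2 is already closed and document 3 was never marked Abandoned.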
I completed the whole global change and ran it in test mode on a sample of 50 documents. It appeared to me that the output is correct. The output is on QA in:
https://cdr.qa.cancer.gov/cgi-bin/cdr/ShowGlobalChangeTestResults.py?dir=2013-09-17_20-39-39
These were pretty spare documents. Only one of the 50 had more than one LeadOrg, but I think it was transformed correctly.
All of the documents fail validation because the QA server does not yet have "Unknown" added to the list of valid values in the schema. We don't care much about that for test purposes but the value has to be there on PROD or else we can't save publishable versions of the documents. I'll talk to Bob and Volker about that on Thursday.
The new value is installed on QA so please run another global in test mode.
The global is complete. Results are in:
https://cdr.qa.cancer.gov/cgi-bin/cdr/ShowGlobalChangeTestResults.py?dir=2013-09-19_12-43-09
I have reviewed a few trials from the test results of the global and they all appear to be okay. Please proceed to run the global in live mode on QA.
The global ran successfully in live mode. I will attach the log file.
Here is the log file for the live run on QA.
I'm currently blocked from pushing any documents to the QA Gatekeeper. When Bryan tells me I can send documents again I will push the protocol that's waiting in the queue.
Is this one of those instances where we can use DT (gatekeeper-dt.qa.cancer.gov)?
I don't know what gatekeeper-dt is used for but if Bryan wants me to send the document(s) to gatekeeper-dt I can do that.
The protocol with the new status of 'Unknown' zipped through Gatekeeper without problems and the new status is displayed on Cancer.gov (www-dt.qa.cancer.gov).
Hot diggity!
Where can I review the trials at? I tried the URL Erika provided (http://gatekeeper-dt.qa.cancer.gov/) and got a 403 error.
You'll have to be on the QA Bastion host and go to
http://www-dt.qa.cancer.gov/
You need to log on to the QA bastion host.
From there, if what you want to review is the trial documents that were modified by the global change, you can do that in the usual way. Get the list of document IDs from the attached file 3646.log. Then using the CDR IDs from the logfile, you can view any trial document using the usual tools, e.g., XMetal, the Admin subsystem document reports, or the XML display viewer at:
https://cdr.qa.cancer.gov/cgi-bin/cdr/show-cdr-doc.py
What Volker did was upload a single trial to a QA version of the Gatekeeper for cancer.gov in order to see if cancer.gov could handle the change okay. It apparently worked, though when I just tried the gatekeeper-dt URL it didn't work for me either. I think the posted URL must either be incomplete, or the server is inaccessible. I also tried the https version of the same URL without success.
Maybe Volker can weigh in on that, but he's gone home for the day.
If you want to go to Gatekeeper you'll have to use this URL:
http://gatekeeper-dt.qa.cancer.gov/admin/Home.aspx
but I don't think that's what you want to do. You want to see that the
document made it to Cancer.gov which is (as listed above) this
URL:
http://www-dt.qa.cancer.gov/
Please be aware that the site is veeerrrry slow (it may be running on an old Atari). :-)
I have reviewed several trials from the global and they all looked
good. I have to review the logs for any possible validation errors. I
was able to access the site this time around. It looks like the site was
down yesterday.
Volker, what is the ID of the trial you published?
You probably tried accessing the site while the backup was running
which makes everything so much slower.
The protocol is CDR699222 (WCTU-EL-CID).
Thanks, Volker. I think we're all set. I reviewed the log file and didn't see anything unexpected.
Verified on QA
Volker:
This issue is good to go but I was wondering if you want to do some test
publishing. I don't think reviewing the XML file alone is sufficient. It
will be good to see how it shows up on the test site.
I actually already ran a "nightly" publishing job yesterday
afternoon.
Do you need to do anything on the QA site or was this dependent on the
global that Bob ran?
It should be okay to test then. I don't think you need to run publishing again since the global for this issue was run earlier. Could you please give the URL to the test site?
The nightly publishing job finished and pushed the documents to
gatekeeper-dt.qa.cancer.gov.
I believe the proper front-end for this GK version is
http://www-dt.qa.cancer.gov/
which - again - is very slow.
We checked a few of the trials but the changes are not on the test site. The statuses of the trials still reflect the old status before the global was done to change the statuses to Unknown and the trials appear to be in the Active trials section of the test site. Here are two of them:
I see. We tested this with an individual document being hot-fixed and
a hot-fix is being taken as is. However, the regular publishing job
selects the protocols based on their status values and we have not yet
modified the list of allowed values in the publishing document.
I will add the 'Unknown' status to the list of closed InScopeProtocols
and rerun a publishing job on QA.
I've updated the publishing document and submitted a new publishing job.
R12113: CDR178.xml
The publishing job on QA finished and the two protocols William listed are now displaying with the protocol status of 'Unknown'.
We will install the new version of the publishing control document from the bastion host as part of the release process (so the package we're giving to CBIIT won't be affected).
Verified on QA. Thanks!
The global change for this task has been run on the production server; please verify.
Volker: do you want me to install the new publishing control document, or would you prefer to control the timing of that?
Sure, you can install it. I actually had already forgotten about it.
Done. I'm glad you promised to check tomorrow night's publishing job. 🙂
This is verified on Prod and Cancer.gov. Can I begin closing issues that have been verified on Prod?
Yes, please do.
File Name | Posted | User |
---|---|---|
3646.log | 2013-09-23 15:31:15 | |
ABANDONED TRIALS.txt | 2013-09-17 17:02:23 | Osei-Poku, William (NIH/NCI) [C] |