Issue Number | 2223 |
---|---|
Summary | [CTGOV Import] Modify the Import program to download new and updated data only |
Created | 2007-05-30 15:29:43 |
Issue Type | Improvement |
Submitted By | Grama, Lakshmi (NIH/NCI) [E] |
Assigned To | |
Status | Closed |
Resolved | 2014-11-05 11:01:41 |
Resolution | Won't Fix |
Path | /home/bkline/backups/jira/ocecdr/issue.106551 |
BZISSUE::3279
BZDATETIME::2007-05-30 15:29:43
BZCREATOR::Lakshmi Grama
BZASSIGNEE::Bob Kline
BZQACONTACT::Lakshmi Grama
We need to explore whether the daily download of data from CTGOV can be modified to download only those trials that have been updated since the last download, as well as new trials that have been added since the last download. CTGOV has informed us about the syntax to use for incremental downloads.
BZDATETIME::2007-05-30 17:21:54
BZCOMMENTOR::Bob Kline
BZCOMMENT::1
I have begun testing of the new syntax.
BZDATETIME::2007-06-07 12:55:31
BZCOMMENTOR::Bob Kline
BZCOMMENT::2
We're going to need to coordinate this with the other issue (not yet filed, I don't think) dealing with the undocumented (?) limitation on the complexity of the queries that we can submit to NLM.
BZDATETIME::2007-06-12 09:44:53
BZCOMMENTOR::Bob Kline
BZCOMMENT::3
I recommend that we put off further work on this task until we've put to bed the problems with query limitations in CT.gov. Implementing this enhancement will make those problems worse.
BZDATETIME::2007-07-05 13:42:01
BZCOMMENTOR::Bob Kline
BZCOMMENT::4
Lowered priority at Lakshmi's request.
BZDATETIME::2009-02-24 13:37:28
BZCOMMENTOR::Volker Englisch
BZCOMMENT::5
Removing Sheri from the CC list.
BZDATETIME::2009-06-30 09:40:59
BZCOMMENTOR::Bob Kline
BZCOMMENT::6
Not an active task right now.
BZDATETIME::2010-11-01 16:04:05
BZCOMMENTOR::Volker Englisch
BZCOMMENT::7
I'm guessing that this task won't be addressed any time soon.
Shouldn't it rather be canceled instead of keeping it around as a
P10?
BZDATETIME::2010-11-01 16:45:39
BZCOMMENTOR::Bob Kline
BZCOMMENT::8
Unlike some of the other tasks which you have correctly identified as obsolete (or very nearly so), such as those connected with electronic mailers which will soon be turned off (if they haven't been already), this task is for importing documents from NLM, which we don't anticipate will ever go away (please correct me if I'm wrong, Lakshmi). Also, although I'm usually the one arguing against aggressive measures to reduce disk space usage, this is one area in which we could profitably invest in work on the software which would cut down dramatically on what we need to store on the production CDR server without any reduction in functionality or data safety. I would not want to eliminate the ability to submit a full query to NLM, but rather retain it as something we could use to periodically ensure that nothing has fallen through the cracks, while using the more efficient method on a daily basis.
In an email message sent by Nick Ide back in May of 2007, he provided the syntax for narrowing our queries to just get new or changed trials, using the following patterns:
http://clinicaltrials.gov/ct/search?term=%22May+15%2C+2007%22+:+MAX+%5BFIRST-RECEIVED-DATE%5D
http://clinicaltrials.gov/ct/search?term=%22May+15%2C+2007%22+:+MAX+%5BLAST-CHANGED-DATE%5D
For some reason, that had never been captured in this issue. If we ever actually come back and work on this (and it does seem like one of the issues on hold which might actually be worthwhile), we'll need this syntax information.
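For reference, here is a minimal sketch of how the two query URLs above could be assembled for an arbitrary cutoff date. The function name `incremental_url` and the `MONTHS` table are hypothetical, not part of the existing import program. The only date format confirmed by the email thread is "May 15, 2007", so the spelling used for other months (full English names here) is an assumption, and the sketch only builds the URL without contacting the 2007-era search endpoint.

```python
from datetime import date
from urllib.parse import quote_plus

# Search endpoint as quoted in the 2007 email thread.
BASE_URL = "http://clinicaltrials.gov/ct/search"

# ASSUMPTION: months are spelled out in full; only "May" is confirmed.
MONTHS = [
    "January", "February", "March", "April", "May", "June", "July",
    "August", "September", "October", "November", "December",
]

# The two field names that appear in the patterns above.
NEW_TRIALS = "FIRST-RECEIVED-DATE"
CHANGED_TRIALS = "LAST-CHANGED-DATE"


def incremental_url(since: date, field: str) -> str:
    """Build a search URL of the form: "May 15, 2007" : MAX [FIELD]."""
    stamp = f"{MONTHS[since.month - 1]} {since.day}, {since.year}"
    term = f'"{stamp}" : MAX [{field}]'
    # Keep ':' unescaped so the result matches the quoted patterns.
    return f"{BASE_URL}?term={quote_plus(term, safe=':')}"


if __name__ == "__main__":
    cutoff = date(2007, 5, 15)
    print(incremental_url(cutoff, NEW_TRIALS))
    print(incremental_url(cutoff, CHANGED_TRIALS))
```

A daily job would presumably request both URLs with the date of the last successful download, while the existing full query stays available for the periodic check mentioned earlier.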
Note to myself: the original email thread used the subject line "Refinement to the CT.gov search interface" and spanned the date range 2007-05-22 through 2007-06-22. All the messages are in my email archives in the cips-2007 folder.
... this task is for importing documents from NLM, which we don't anticipate will ever go away ...
Never say "never"! Looks like that assumption was wrong. Closing this ticket. 😃