Issue Number | 73 |
---|---|
Summary | [Queue] Old/Incomplete citations appearing in queue |
Created | 2013-09-17 14:25:26 |
Issue Type | Improvement |
Submitted By | Shields, Victoria (NIH/NCI) [E] |
Assigned To | |
Status | Closed |
Resolved | 2020-02-06 10:32:29 |
Resolution | Won't Fix |
Path | /home/bkline/backups/jira/oceebms/issue.113389 |
TIR #2494 entered 2013-04-04 by Victoria Shields (Future Release status)
While reviewing citations, I noticed an old citation (PMID: 17036354) from 2006 in my queue. It was imported in Feb 2007 and all action associated with this citation seems to date back to around that time, so I don't know why it appeared now. I also noticed several newer citations that were imported without the publication year. See PMIDs 22920360 & 23235761 as examples.
Shields, Victoria (NIH/NCI) [E] (7/24/2013 6:00 PM): It looks like a
significant amount of work would be needed to change this. Since we
don't yet know how big a problem it is (I haven't noticed any new
references with missing dates, although perhaps others have), let's put
this on hold for now and wait and see if changes are really necessary.
Kline, Bob (NIH/NCI) [C] (5/9/2013 4:19 PM):
Kline, Bob (NIH/NCI) [C] (4/17/2013 3:03 PM): We
have two separate issues in one TIR here. It's easiest to track the
issues when we have a separate TIR for each issue. First issue:
It looks like PMID 17036354 was in the legacy system, with the appropriate state information which would put it in the board manager's queue for the EBMS. It was in the mt_REVIEW table for the Adult Treatment board (that's the table which recorded the librarians' "publish" decisions) and there were no rows indicating a decision taken for this board. See attached screen shot from the development system clone of the legacy system at the point in time when the conversion was done.
Second issue:
NLM uses two different XML structures for storing the publication date for an article. One of them looks like this:
<PubDate>
  <Year>...</Year>
  <Month>...</Month>
  <Day>...</Day>
</PubDate>
That's the one I was aware of when I wrote the conversion software.
Sometimes NLM stores the publication date this way:
<PubDate>
  <MedlineDate>...</MedlineDate>
</PubDate>
The conversion software didn't pick those up.
Alan's import software is smart enough to handle both structures, so imports done after the conversion will get the date either way. If it's important that we go back and fill in the missing dates for the older articles, we can look into what would be involved. It's unfortunate that this didn't get spotted until now, when we don't have the ability to do global changes in the database directly, without going through CBIIT, but I think it should still be possible to create a script, test it on the lower tiers, and get CBIIT to run it for us. Let us know how we should proceed.
We do have the information in the XML we've stored in the source_data column; we just haven't extracted it into the published_date column.
Some parts of the EBMS call for showing the year of publication. In
those cases, we're going directly into the XML and pulling the value out
of the Year element from the first structure described above. In those
cases, we won't have anything to offer for the articles in which NLM has
used the second, free text "structure."
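For reference, handling both structures when pulling out the date could look roughly like the Python sketch below. It is only an illustration, not the conversion script or Alan's import code: the function name `extract_pub_date`, the year pattern, and the sample values are all assumptions.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical pattern: first four-digit year (1800-2099) in free text.
YEAR_PATTERN = re.compile(r"\b(?:1[89]|20)\d{2}\b")


def extract_pub_date(pub_date):
    """Return (date_text, year) for a PubDate element; either may be None."""
    if pub_date is None:
        return None, None
    # First structure: separate Year/Month/Day children (tolerate missing
    # Month or Day).
    year = (pub_date.findtext("Year") or "").strip()
    if year:
        parts = [year]
        for tag in ("Month", "Day"):
            value = (pub_date.findtext(tag) or "").strip()
            if value:
                parts.append(value)
        return " ".join(parts), year
    # Second structure: free-text MedlineDate; take the first year found.
    medline = (pub_date.findtext("MedlineDate") or "").strip()
    if medline:
        match = YEAR_PATTERN.search(medline)
        return medline, match.group(0) if match else None
    return None, None


# Illustrative values only, not taken from the articles in this ticket.
structured = ET.fromstring(
    "<PubDate><Year>2006</Year><Month>Oct</Month><Day>12</Day></PubDate>")
free_text = ET.fromstring(
    "<PubDate><MedlineDate>2012 Nov-Dec</MedlineDate></PubDate>")
print(extract_pub_date(structured))
print(extract_pub_date(free_text))
```

Falling back to the first year-like token in MedlineDate would give the year displays something to show for the free-text case, at the cost of guessing when the text spans two years.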
Shields, Victoria (NIH/NCI) [E] (4/4/2013 4:05 PM):
What's the status of this issue (these issues) now that we've had time to work with the system?
None of us seem to be having any problems with old/incomplete citations lately (not since the citation publishing problem was fixed that we can recall), so I'm going to close this issue.
This issue has cropped up again, so I'm re-opening it. Victoria has an old (2004) citation in her queue (PMID: 15542800). Here's what she said:
"It has a weird history. It went to my board in 2005 but no one replied (not the weird part) and it also went to the Peds board in 2005 but was given a “no” by a Peds reviewer. But then it must have gotten a second chance because it was ultimately approved at a Peds board meeting in 2012. Sharon just uploaded the pdf to the EBMS (housekeeping, I’m guessing), which might be what triggered it to appear in my box, but I don’t think the EBMS is supposed to work that way!
I’ll just give it a no to get it out of my queue, but thought maybe I should mention it in case Bob wanted to take a look."
You're right about the triggering effect of posting the PDF: the queue for full text review won't include any articles for which the system doesn't have any full text to review. The article did indeed go out to review for the Renal Cell Cancer topic, but it had also been assigned the "Hairy Cell Leukemia" topic, for which it hadn't gotten past the "Passed Board Manager" state, so it was waiting for a decision for that topic based on the full text review.
Just to capture what the state table has for my own reference:
mysql> SELECT b.board_name, tp.topic_name, t.state_text_id, s.status_dt, s.current
-> FROM ebms_article_state s
-> JOIN ebms_article_state_type t
-> ON t.state_id = s.state_id
-> JOIN ebms_board b
-> ON b.board_id = s.board_id
-> JOIN ebms_topic tp
-> ON tp.topic_id = s.topic_id
-> WHERE article_id = 53098
-> ORDER BY s.status_dt;
+---------------------+------------------------------------------------+--------------------+---------------------+---------+
| board_name          | topic_name                                     | state_text_id      | status_dt           | current |
+---------------------+------------------------------------------------+--------------------+---------------------+---------+
| Adult Treatment     | Hairy Cell Leukemia                            | ReadyInitReview    | 2004-11-17 12:03:33 | N       |
| Adult Treatment     | Renal Cell Cancer                              | ReadyInitReview    | 2004-11-17 12:03:33 | N       |
| Pediatric Treatment | Wilms' Tumor and Other Childhood Kidney Tumors | PassedInitReview   | 2004-11-17 12:03:33 | N       |
| Pediatric Treatment | Wilms' Tumor and Other Childhood Kidney Tumors | Published          | 2004-11-18 13:11:20 | N       |
| Adult Treatment     | Hairy Cell Leukemia                            | RejectInitReview   | 2004-11-24 12:03:33 | N       |
| Pediatric Treatment | Wilms' Tumor and Other Childhood Kidney Tumors | PassedBMReview     | 2004-11-24 18:23:53 | N       |
| Adult Treatment     | Renal Cell Cancer                              | PassedInitReview   | 2005-01-03 00:00:00 | N       |
| Adult Treatment     | Hairy Cell Leukemia                            | Published          | 2005-01-13 09:50:28 | N       |
| Adult Treatment     | Renal Cell Cancer                              | Published          | 2005-01-13 09:50:28 | N       |
| Pediatric Treatment | Wilms' Tumor and Other Childhood Kidney Tumors | PassedFullReview   | 2005-01-13 11:51:01 | N       |
| Adult Treatment     | Renal Cell Cancer                              | PassedBMReview     | 2005-02-01 11:25:39 | N       |
| Adult Treatment     | Hairy Cell Leukemia                            | PassedBMReview     | 2005-02-01 11:25:39 | Y       |
| Adult Treatment     | Renal Cell Cancer                              | PassedFullReview   | 2005-02-22 15:52:53 | Y       |
| Pediatric Treatment | Wilms' Tumor and Other Childhood Kidney Tumors | FinalBoardDecision | 2012-11-21 14:20:33 | Y       |
+---------------------+------------------------------------------------+--------------------+---------------------+---------+
14 rows in set (0.00 sec)
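To make the queue behavior concrete, here is a small Python sketch of the rule as I understand it from the rows above. It is not the EBMS code: the `TopicState` class and the function name are made up for illustration, and treating `PassedBMReview` as the state that feeds the full-text queue is my assumption.

```python
from dataclasses import dataclass


@dataclass
class TopicState:
    topic: str
    state: str
    current: bool


def topics_awaiting_full_text_review(states, has_full_text):
    """List topics whose current state is waiting on a full-text decision."""
    # Without full text there is nothing to review, so nothing is queued.
    if not has_full_text:
        return []
    # Assumption: only a current PassedBMReview row puts a topic in the queue.
    return [s.topic for s in states
            if s.current and s.state == "PassedBMReview"]


# The three rows marked current (Y) in the table above.
current_rows = [
    TopicState("Hairy Cell Leukemia", "PassedBMReview", True),
    TopicState("Renal Cell Cancer", "PassedFullReview", True),
    TopicState("Wilms' Tumor and Other Childhood Kidney Tumors",
               "FinalBoardDecision", True),
]
print(topics_awaiting_full_text_review(current_rows, has_full_text=False))
print(topics_awaiting_full_text_review(current_rows, has_full_text=True))
```

Under those assumptions the article stays out of the queue until the PDF is posted, and then shows up only for the Hairy Cell Leukemia topic.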
Do we need to keep this ticket open? There are two different issues represented by the ticket. The decision was made that we wouldn't do anything about the first issue and the ticket was closed. For the second issue I believe we determined that the software is behaving correctly. Let me know if that's not right.
We'd like to keep this ticket open, but I'll move it to the 'on hold' status. The problem continues to happen, although, as you said, the software is behaving as intended. As a general rule, citations that received a "no" decision at the full-text state did not get recorded in the CiteMS, so this issue will continue to appear from time to time as PDFs are added to older citations. We may want to revisit this in the future. One possibility we discussed is to flag older citations in our queues (using a certain date cutoff) so we immediately know they are older.
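If we do revisit the flagging idea, the check itself would be simple. The Python sketch below is only meant to show the shape of it: the function name, the cutoff value, and the choice of which date to compare against (here the earliest state date) are all placeholders that would need to be decided.

```python
from datetime import date


def flag_older(queue, cutoff):
    """Pair each (pmid, first_seen) queue entry with an is-older flag."""
    return [(pmid, first_seen, first_seen < cutoff)
            for pmid, first_seen in queue]


CUTOFF = date(2010, 1, 1)  # hypothetical cutoff, not an agreed value
queue = [
    (15542800, date(2004, 11, 17)),  # earliest state date in the table above
    (99999999, date(2013, 9, 1)),    # made-up recent entry for contrast
]
for pmid, first_seen, older in flag_older(queue, CUTOFF):
    print(pmid, first_seen.isoformat(), "OLDER" if older else "")
```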
File Name | Posted | User |
---|---|---|
109835.jpg | 2013-09-17 14:25:26 |