Issue Number | 203 |
---|---|
Summary | [Search Database] Add Option to View Unpublished ONLY, NOT listed ONLY, and Rejected ONLY |
Created | 2014-06-09 17:25:15 |
Issue Type | Improvement |
Submitted By | Juthe, Robin (NIH/NCI) [E] |
Assigned To | Kline, Bob (NIH/NCI) [C] |
Status | Closed |
Resolved | 2014-08-03 08:04:00 |
Resolution | Fixed |
Path | /home/bkline/backups/jira/oceebms/issue.129188 |
The medical librarians would like to have the ability to search the database for ONLY unpublished, NOT listed, and/or Rejected citations. I think the best way to accomplish this would be to add the following new checkboxes to the Search the Database page:
__ NOT LISTED ONLY
__ REJECTED ONLY
__ UNPUBLISHED ONLY
These options would limit the search results to ONLY yield citations that had been not listed, rejected, and/or had not been published. The checkboxes should be added to the Administrator Search section. Please leave them unchecked by default.
Adding Cynthia and Minaxi.
Ready for user review on DEV.
I have tested this in dev and it seems to be successfully retrieving the ONLY citations.
Cynthia and Minaxi have tested this one and provided the following feedback:
Rejected Only checkbox – seems to be working when limited by topic or
board (see comments from OCEEBMS 204)
NOT Listed Only checkbox – We calculate the total number of citations
excluded by NOT List manually by subtracting the number in Queue
immediately after import from the total number of citations retrieved in
search strategies. For Jan 2014 this number was calculated to be 850.
Using the NOT Listed Only checkbox we got 1201. How is this number being
generated because in theory it should be 850, right?
Unpublished Only Checkbox – this seems to be working and is in fact
retrieving citations that are not listed, rejected or not yet reviewed
by med lib.
Something else we found in the process of testing these new
checkboxes…
In general the numbers generated in the citations reports should match
the numbers that we can generate using the checkboxes in the search the
dataset feature. This does not seem to be the case.
To get the statistics for “Total Citations Retrieved in Search
Strategies”, we generated the numbers from search the database by
checking the boxes “include unpublished”, “include not listed” and
“include rejected” for the Sept 2014 review cycle (which comprises new
data we imported for testing) and compared these numbers with the
“citations imported report”.
These numbers do not match. See below chart:
Total Citations Retrieved in Search Strategies: 350
Sept 2014 | Citations Imported report | Search (EBMS QA) |
---|---|---|
Adult Treatment | 179 | 254 |
Pediatric Treatment | 72 | 76 |
Screening & Prevention | 90 | 92 |
Cancer Genetics | 55 | 59 |
(Note that I attached a Word version of the table because the formatting didn't keep when I pasted it above.)
I believe this is the same bug I described in my last comment posted to OCEEBMS-204. Let's see if fixing that resolves these discrepancies.
Cancer Genetics 55 59
The Citations Imported report looks for articles which got the "Ready for Initial Review" status. That would exclude the four articles in the batch which didn't get that status because they were in articles that were "not-listed" for the board, and which hadn't already been imported by other jobs.
Sorry it's taken me so long to address all of your comments above. You've raised some important questions, and I'd like very much to have us all come away satisfied with the answers, and understanding exactly how the system works.
Starting with the count of articles which were rejected because they were published in a journal which the board in question does not care to use: a total of 4,062 articles were "imported" for the January 2014 cycle (where "imported" in this context means that at least one topic was assigned to the article for that cycle). Of those articles, 1,198 were rejected for at least one of those topics because the board for that topic doesn't want to see articles published in the journals in which the articles appeared. That's the number I get for the "not-listed" articles for January when I submit a database query directly, and it's what I get when I do a search on QA for the January 2014 cycle with the "NOT LISTED ONLY" box checked, so I'm going to guess you were looking at the number for a tier other than QA, since the number you reported was 1,201, or three higher than what I got on QA. I can think of two conditions which would cause that number to be higher than the result you would get if you were to subtract the number of articles in your queue after all of the import jobs for the cycle were done. One would be if there were articles in your queue which were left over from the previous cycle, still awaiting a decision from you. The other condition would be the presence of articles which were picked up by the NOT LISTED ONLY search because one of the topics selected for the articles was for a board which had the articles' journals on its "NOT" list, but other topics were assigned for a different board, which didn't want to automatically reject articles from those journals, so the articles would also show up in your queue anyway. (A variation on this second condition would be if some of the import jobs were done with the "NOT LIST" box checked, which would suppress the rejection based on the articles' journals.) I don't have a way to see what the size of your queue was at the time you're describing. I assume you just wrote down the number back then, right?
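The second condition can be illustrated with a small sketch. The article IDs, board names, and data layout below are made up for illustration and are not the actual EBMS schema. The only idea taken from the comment above is that the search counts an article if ANY of its topics hit a NOT list, while the manual count (imported minus queue) effectively counts only articles for which ALL topics did.

```python
# Hypothetical data: article ID -> {board: True if the article's journal
# is on that board's NOT list}. None of these values come from EBMS.
imported = {
    101: {"Adult Treatment": True},                      # rejected for its only board
    102: {"Adult Treatment": True, "Screening": False},  # NOT-listed for one board, queued for another
    103: {"Cancer Genetics": False},                     # queued normally
}

def not_listed_only(articles):
    """Articles rejected by a NOT list for at least one board."""
    return {a for a, boards in articles.items() if any(boards.values())}

def review_queue(articles):
    """Articles with at least one board that did not auto-reject them."""
    return {a for a, boards in articles.items() if not all(boards.values())}

# What a NOT LISTED ONLY search would return under these assumptions.
search_count = len(not_listed_only(imported))

# The librarians' manual method: total imported minus queue size.
manual_count = len(imported) - len(review_queue(imported))

# Articles counted by the search but still sitting in the queue.
overlap = not_listed_only(imported) & review_queue(imported)

print(search_count, manual_count, sorted(overlap))
```

Under these assumptions the gap between the two counts equals the number of articles that were NOT-listed for one board but still queued for another.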
For the second issue you described in that same comment, I have corrected the problem I noted in my September 15 comment, so the four Genetics articles which were missed by the "Citations Imported" report are now included, and both the report and the search show 59 for that board and the September cycle. However, there are still some discrepancies which I don't totally understand, and they appear to be caused by articles which are assigned a "Published" state for a topic without having been given any earlier states. For example, the search for Pediatric Treatment board articles for the September cycle returns 76 articles (as you show in the table you posted), but the "Citations Imported" report shows only 75 for that board/cycle combination. So we're closer, but still one off. The article which was missed by the report but found by the search is EBMS ID# 328390 ("Relationship between CYP1A1 polymorphisms and invasion and metastasis of breast cancer"; clearly this was done by a developer or member of the QA team who wasn't paying any attention to the actual relevance of the article to the topic/board selected, as the topic was "Childhood Brain & Spinal Cord Tumors" - the user was recorded as Test Board Manager). There's only one row in the state table for this article for Peds in September, and that's for the Published state. There are no rows for earlier states for this article/topic combination, which is why the "Citations Imported" report doesn't pick it up. Are there places other than the "Publish Citations" page where you can publish an article/topic combination?
Alan: Do you have any ideas about how this state row could be created without the earlier states being recorded for the article/topic combo? The only way for it to show up on the "Publish Citations" page is if the combination appears in the state table with a current state of "Ready for Initial Review" (as far as I can determine).
OK, I have figured out what's going on. If you bring up the "Full Citation" page for an article which is already in the EBMS, and you add a topic to the article (either for a board which already has another topic associated with the article, or a different board which you add yourself), then the article is put directly into the Published state, and it never gets a row in the state table for ReadyInitReview. I guess that makes sense, as the decision to add the topic there obviates the need for the initial review. So I need to know if such an article-topic combination should be reflected in the counts of articles shown in the "Citations Imported" report. On the face of it, this action doesn't seem very "import"-like. In the more granular Import Report we don't include an article which had been imported previously and for which a new topic was added in the "ARTICLES IMPORTED" count (they instead get picked up for the "DUPLICATE ARTICLES" count, as well as the "ARTICLES WITH TOPICS ADDED" count). On the other hand, we've already muddied the linguistic waters a little bit by having the assignment of new topics for existing articles happen in the context of an import, and we were including such articles in the less granular "Citations Imported" report's counts (see my earlier comment, where I wrote that "'imported' in this context [the 'Citations Imported' report] means that at least one topic was assigned to the article for that cycle"). Weighing in favor of reflecting articles in the "Citations Imported" report based on the addition of a new topic to the article on the "Full Citation" page would be the librarians' assumption that "In general the numbers generated in the citations reports should match the numbers that we can generate using the checkboxes in the search the dataset feature" (see earlier comment above). I guess I need to know whether the "In general ..." part of that sentence means that there are exceptions to the "should match" rule, and we wouldn't expect the numbers from the search and from the report to match in such cases. I can make the report behave either way. Just let me know what the consensus is.
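A minimal sketch of why the report and the search disagree in this case. The row layout, the function names, and the first article ID are hypothetical. The state names and EBMS ID 328390 come from the comments above, and the sketch assumes the report keys on the ReadyInitReview state while the search accepts any state row.

```python
# Hypothetical state-table rows: (article_id, topic, state).
# Article 1 is invented; 328390 is the article discussed in this thread.
state_rows = [
    (1, "Childhood Brain & Spinal Cord Tumors", "ReadyInitReview"),
    (1, "Childhood Brain & Spinal Cord Tumors", "Published"),
    # Topic added on the "Full Citation" page: only a Published row exists.
    (328390, "Childhood Brain & Spinal Cord Tumors", "Published"),
]

def report_pairs(rows):
    """Article/topic pairs the report would count (assumed: needs ReadyInitReview)."""
    return {(a, t) for a, t, s in rows if s == "ReadyInitReview"}

def search_pairs(rows):
    """Article/topic pairs the search would find (assumed: any state row)."""
    return {(a, t) for a, t, _ in rows}

# Pairs found by the search but missed by the report.
missed = search_pairs(state_rows) - report_pairs(state_rows)
print(missed)
```

Making the report match the search would amount to dropping the ReadyInitReview condition in `report_pairs`; keeping it preserves the "import"-only meaning of the report.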
This is a combined reply to your previous two comments.
Regarding count of articles rejected by NOT journals, your logic
makes sense to us. Regarding the conditions that can cause higher
numbers “One would be if there were articles in your queue which were
left over from the previous cycle” is not valid as the queue is always
empty before the importing begins. The number in queue is written down
as soon as the import is completed for the review cycle. These numbers
should be correct unless someone else imports a citation and does not use the
tag “fast tracked”. The other condition seems valid. “The other
condition would be the presence of articles which were picked up by the
NOT LISTED ONLY search because one of the topics selected for the
articles was for a board which had the articles' journals on its "NOT"
list, but other topics were assigned for a different board, which didn't
want to automatically reject articles from those journals, so the
articles would also show up in your queue anyway.”
What this means for us is that we will be depending on the manual count
to have the exact number of citations rejected by NOT Journals rather
than search the database.
For the second issue, now you have figured out what is going on and we
understand why the search the database numbers may not match the numbers
in the reports.
The present “Citations imported report” is exactly what we want and we
would not like to add the other citations with topics added to that
report. So the bottom line is “we wouldn't expect the numbers from the
search and from the report to match”
What this means for us is that we will be depending on the manual count to have the exact number of citations rejected by NOT Journals rather than search the database.
It sounds like for your purposes "number of citations rejected by NOT Journals" means "number of articles for which ALL topics assigned during import were rejected by the NOT lists." But we still want to have the search return all articles rejected by a NOT list for ANY topic assigned during import, right?
Have we covered all of your concerns, or did I miss any outstanding loose threads?
Minaxi and I agree that the search the database is fine as is. What it is doing makes sense to us especially when broken down by board. All other statistics except NOT journal count have a report as well. Perhaps we could have a report that would reflect the data that we want to have in the monthly report. So total number of citations NOTed out by NOT Lists at the time of import broken down by boards and then a grand total that reflects the number of unique citations completely NOTed out.
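The report described above could be computed along these lines. This is only a sketch under assumptions: the input layout, the function name `not_list_report`, and the sample data are invented for illustration, and no such report exists in this ticket.

```python
from collections import Counter

# Hypothetical import results for one cycle: article ID -> {board: True if
# the article was rejected by that board's NOT list at import time}.
cycle = {
    101: {"Adult Treatment": True},
    102: {"Adult Treatment": True, "Screening & Prevention": False},
    103: {"Cancer Genetics": False},
    104: {"Adult Treatment": True, "Pediatric Treatment": True},
}

def not_list_report(articles):
    """Return (per-board NOT-list counts, number of unique articles rejected for ALL boards)."""
    per_board = Counter()
    fully_noted = set()
    for article_id, boards in articles.items():
        for board, rejected in boards.items():
            if rejected:
                per_board[board] += 1
        # An article is "completely NOTed out" only if every board rejected it.
        if boards and all(boards.values()):
            fully_noted.add(article_id)
    return dict(per_board), len(fully_noted)

per_board, grand_total = not_list_report(cycle)
print(per_board, grand_total)
```

The per-board counts can sum to more than the grand total, because an article rejected by two boards is counted once for each board but only once as a unique, fully rejected article.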
If you decide you want such a report, add a ticket to the backlog and we'll try to implement it for the next release.
Cynthia/Minaxi,
Can we move this issue to QA Verified? It looks like everything has been addressed in the comments (aside from a possible new report in a future release, which will have its own ticket), but I just wanted to be sure before we close this out.
Thanks,
Robin
Yes, I think we have sorted this issue out. We can work on the report as a separate issue.
OK, great. Thank you all for the thorough testing and troubleshooting! I'm marking this verified on QA.
Cynthia/Minaxi, please enter a new issue for the report whenever you're ready.
Verified on Prod.
File Name | Posted | User |
---|---|---|
OCEEBMS-203.doc | 2014-09-12 10:34:55 | Shields, Victoria (NIH/NCI) [E] |