EBMS Tickets

Issue Number 613
Summary Discuss streamlining import action assignments with the users
Created 2021-09-29 15:38:57
Issue Type Inquiry
Submitted By Kline, Bob (NIH/NCI) [C]
Assigned To Kline, Bob (NIH/NCI) [C]
Status Closed
Resolved 2022-12-12 10:01:38
Resolution Fixed
Path /home/bkline/backups/jira/oceebms/issue.299694
Description

We may want to consider only recording the "import" of an article as "duplicate" if we're not assigning a new topic for the article, and we're not refreshing changed data with new values from NLM. The current use of "duplicate" is confusing when we have the article assigned to one topic, and it's being added to an import job so that it gets assigned to a second topic (possibly also getting its values updated).

Comment entered 2021-10-01 11:49:19 by Kline, Bob (NIH/NCI) [C]

Also, it seems odd to include articles which were rejected by the board's "NOT" list in the XX ARTICLES READY FOR REVIEW block, since not-listed is a terminal state.

Comment entered 2022-02-09 10:23:15 by Boggess, Cynthia (NIH/NCI) [C]

I agree. I can see how other users could find these import counts confusing. If we are adding a citation to the database and it is already in the database, then technically yes, it is a duplicate, but because we may be assigning new data to that citation, we don't necessarily have to call it or treat it as a duplicate. For our purposes, a true "duplicate" would be only if the citation and summary topic(s) are the same.

When we do imports, it is important for us to know how many of the citations are already in the database, regardless of whether we are assigning new topics. This is currently what we are calling "duplicates" in the import report. We also need to know how many of the citations we imported were updated with new summary topics, which we also have a count for in the import report.

I have no objections to adding a count that reflects true "duplicates" but I don't want to replace or remove the duplicate count we already have because we use it...maybe we need to rename the counts.

Regarding the NOT listed, I agree having them in the articles ready for review count is not helpful.

Comment entered 2022-03-15 15:32:49 by Kline, Bob (NIH/NCI) [C]

How about if we

  • record an article as "imported" if this is the first topic; or

  • record the article as "topic added" if this is a new topic, but not the first; or

  • record the article as "duplicate" if we already have the topic?

That way we still have the number which was labeled in the original system as "duplicate" (by adding the "topic added" + "duplicate" counts).
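
The three-way rule proposed above can be sketched roughly as follows. This is only an illustration of the proposal, not the EBMS code: the function names (classify_import, tally), the "already in database" label, and the data shapes (a dict mapping article IDs to sets of topics, and a job given as a list of article/topic pairs) are all assumptions made for the example.

```python
from collections import Counter


def classify_import(existing_topics, new_topic):
    """Return the proposed disposition for one article/topic pairing.

    existing_topics: set of topics the article already has (hypothetical shape).
    """
    if not existing_topics:
        return "imported"       # first topic for this article
    if new_topic not in existing_topics:
        return "topic added"    # article known, topic is new
    return "duplicate"          # article already has this topic


def tally(job, database):
    """Count dispositions for a job of (article_id, topic) pairs.

    database: dict mapping article_id -> set of topics (hypothetical shape).
    Works on a copy so the caller's data is left untouched.
    """
    known = {article_id: set(topics) for article_id, topics in database.items()}
    counts = Counter()
    for article_id, topic in job:
        topics = known.setdefault(article_id, set())
        counts[classify_import(topics, topic)] += 1
        topics.add(topic)
    # The number the original system labeled "duplicate".
    counts["already in database"] = counts["topic added"] + counts["duplicate"]
    return counts
```

For instance, with a database of {101: {"prostate"}, 102: {"breast"}} and a job assigning "prostate" to articles 101, 102, and 103, this sketch counts one "duplicate", one "topic added", and one "imported", and the combined count of articles already in the database is 2.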

Comment entered 2022-03-17 16:14:56 by Boggess, Cynthia (NIH/NCI) [C]

This is correct and would clarify things. But now I have to do more math 🙂

Comment entered 2022-03-18 09:16:41 by Kline, Bob (NIH/NCI) [C]

You're probably just joking, but if not I can always display another count labeled "Topic added + Duplicate" 🙂

Comment entered 2022-03-18 09:39:05 by Boggess, Cynthia (NIH/NCI) [C]

Yes, I was joking. This won't be an issue for myself or Jeff, but good to know it is an option if down the road new users have an issue with it.

Comment entered 2022-12-12 10:01:38 by Kline, Bob (NIH/NCI) [C]

Implemented in EBMS4.

Comment entered 2023-01-05 17:57:31 by Boggess, Cynthia (NIH/NCI) [C]

This is tricky to test but I am pretty sure this is working correctly on ebms4.

I have verified that the new distinctions outlined in this ticket are being made on ebms4, but we are going to have to assume for now that the counts are correct. They look like they could be correct, but I cannot say for sure because the import reports are understandably slightly different between prod and ebms4. This is primarily due to the order in which imports are done in a given review cycle.

Note: To test, I used the adult treatment prostate cancer import report from Jan 2023, which was recently imported in both prod and ebms4.
