PDQ Issues

Issue Number	288
Summary	Import Report does not display correct numbers
Created	2015-06-03 20:26:31
Issue Type	Bug
Submitted By	trivedim
Assigned To	alan
Status	Closed
Resolved	2015-08-14 15:17:33
Resolution	Won't Fix
Path	/home/bkline/backups/jira/oceebms/issue.162509

Description

I would like to report the problem of Import Report that occurred two times while importing citations today.
After importing Non-Small Cell Lung Cancer file for Adult treatment, I noticed that the import report numbers did not look correct and little strange. The report showed:
144 unique in batch
17 imported
0 not listed
127 duplicate
18 ready for review
1 topic added
0 replaced
0 errors

Since the numbers did not look convincing, I checked the number of citations appearing in Medical Librarian's queue and there are 112 citations. I searched the database to see how many were not listed and how many were imported. My Text file has 144 citations, but the EBMS search indicated that 145 citations are imported and 33 citations are NOT listed. So the numbers in the report are not correct but the citations seemed to be imported although with one extra.

I continued to import and all was well until I reached Childhood HCT. And again the import report was incorrect with strange numbers as follows:

75 unique in batch
0 imported
0 not listed
75 duplicate
0 ready for review
0 topic added
0 replaced
0 errors

The Medical Librarian's queue shows 68 citations for the topic and Search the database shows 76 citations imported, one extra than the number in my text file and 8 NOT listed.

I am attaching both the text files.

I have decided to stop importing. Please let me know if I should continue with importing or wait.

Comment entered 2015-06-04 11:08:47 by alan

I'm investigating.

Comment entered 2015-06-04 12:12:14 by alan

I ran into a problem trying to read the tables in the database that contain the import information. Our CBIIT database administrators have blocked read-only access to them.

I've created a ticket with CBIIT to ask them to grant access and marked it as a "Blocker". I've asked them to fix it by 3 pm today, which is the end of our CDR/EBMS status meeting. Hopefully I'll be able to dig in this afternoon.

Comment entered 2015-06-04 13:12:32 by alan

CBIIT has acted very promptly and I now have access to everything. I'll get back to work on it.

Comment entered 2015-06-04 13:33:14 by alan

Minaxi,

I have to leave for the status meeting but I'll ask quickly about the Childhood HCT, what did you do specifically to see the numbers 76 and 8. I'm seeing 75 and 7 in the import batch report for batch 6598.

Thanks.

Comment entered 2015-06-04 14:26:26 by trivedim

Comment entered 2015-06-04 14:31:39 by trivedim

I Searched the database for June 2015 review cycle, Childooh HCT summary topic and from Administrative search selected Include Unpublished, which gave me 76 citations. When I selected NOT listed, I got 8 citations.
Right now when I tried the exact same search, I got 76 and 7 citations.

Comment entered 2015-06-04 16:21:51 by alan

Minaxi,

I've looked at the Childhood HCT issue. Here's what I've found:

The extra PMID in the search result that was not in the import batch report is 25925135. That article appears to have been imported in an Import Special Search on May 8, 2015 for the May cycle. I looked at the Full Citation page for that article and see that it is listed in both the May and June cycles. That June cycle s what caused it to appear in the Search that you did along with the 75 records imported in the main June cycle import. However I see no record of the article having been imported again in June. Is there something someone could have done to cause it to also appear in the June Cycle? It did have two more topics added on May 15 by Sharon QK.

As for the not list count, I'm hoping that you simply misread or misremembered the search results the first time and that 7 is the real number. There were two other counts of 18 and perhaps you remembered the 8 from one of them? In any case, both the search results and the import batch report currently say 7 is the right number.

So I'm thinking that, unless there is no way to account for the article appearing in two review cycles, this one is benign.

Let me know if you think this is mistaken. In the meantime I'll start looking at the other problem.

Comment entered 2015-06-04 16:35:58 by alan

I found the cycle change on the Childhood HCT article (PMID 25925135.) The original import was in the May cycle for Childhood Leukemia. When the other topics were added on May 15, the June cycle was assigned to them.

So I think the only issue is the 7 / 8 not list dichotomy - which I'm hoping is a simple misremembering.

Comment entered 2015-06-04 16:42:03 by trivedim

Hi Alan,
Regarding the extra PMID, I am assuming that when Sharon added the extra topics, she must have used the default June 2015 review cycle and that is why it shows both May and June review cycles and that explains why it is retrieved in June 2015 search.
For the NOT list count, I don't think I misread it, but now since it shows correct number, we should not worry about it.
Minaxi

Comment entered 2015-06-04 18:17:32 by alan

I'm looking at the non-small cell lung cancer search.

First some bad news.

This one looks like a real problem. There may be a bug.

Two separate EBMS articles were created, 392403 and 392404, both for the same Pubmed ID (24365049). Somehow, the first "article state" (Ready for Initial Review) was created for the higher number 392404 at 2:14:40 pm on June3. 12 seconds later the first article state (also Ready for Initial Review) was created at 2:14:52 pm for 392403. The times are misleading because an import batch assigns the same time to all records in a batch since they are all, in a real sense, imported at once, but something happened to reset the timestamp.

I don't know how to account for this. I got a list of all of the events that occurred during that period sorted by time, and I also checked the article table for articles created at that time. In addition to the two records created from the same PMID, there is also one missing article_id. There is no record number 392406, although all the numbers around that are used. I don't know yet if there are any missing PMIDs.

There may be a bug in the program, or perhaps there was some transitory glitch in the database or the network that briefly stalled the system and caused the import program to stop and restart or abort a transaction, reprocessing one of the records and possibly aborting another.

Now some good news.

I queried the database to find out how many Pubmed IDs appear more than once in the article table. It turns out that this is the ONLY one.
The fact that the problem only occurred once gives some credence to the transitory glitch theory. If it was a bug, why hasn't it happened more than once in all of the imports we've done?
The problem is disconcerting but maybe not dangerous. I can't think of any way that having two copies of the same record in the database will cause anything to break - though it's certainly not something we want to happen.
I will do some more research, but I think it's safe to resume importing records.in the meantime.
Minaxi must be a wizard to have spotted this very non-obvious problem that has happened once in 392,404 records!

Comment entered 2015-06-04 21:18:28 by alan

There was a "database failure" recorded in the watchdog table (wid=61024) that is relevant to what happened. An integrity constraint was violated - attempting to create a duplicate entry for the primary key "392402-79". The watchdog entries are abbreviated but it appears to originate on line 1326 of EbmsImport.inc, attempting to insert a record into the ebms_article_topic table. I think this is probably an effect of a failure earlier in the chain of events rather than a cause.

Maybe our transaction management isn't handling this very well. This particular error is trapped and does not trigger a rollback. The comment explaining this (surely written by me) says:

  // It is conceivable that someone did the same thing
  //  at the same time.  If it's a whole database problem
  //  other things will fail, so don't worry about that.

That may have been over-optimistic. Maybe whatever else went wrong didn't trigger an exception and rollback, or maybe this is at too low a level and our transaction management strategy is splitting hairs when it should have just killed the entire import job.

That still doesn't tell us why something went wrong but might suggest a different strategy for handling such errors - which may or may not ever occur again (only happened once in several years.)

I'll do a little more nosing around on a lead or two and then I think Bob and I should discuss how much, if any, more work should be done on this.

Comment entered 2015-06-04 21:23:58 by alan

It looks from the watchdog message, and also from the ebms_article_state table, that something went wrong processing the record before the one that created the duplicate records. EBMS ID 392402 generated the integrity constraint failure. Interestingly, the row with the earlier date for Ready for Initial Review is marked as "current" while the one stamped 12 seconds later is marked as not current.

What does that mean? I don't know. I'm just trying to accumulate clues here in hopes that, if there are enough of them, the jigsaw puzzle will be full enough to reveal the scene.

I think I'll call it a night on this issue unless I get a brainstorm later.

Comment entered 2015-06-05 11:56:41 by trivedim

i have been able to import the remaining citations without problem.

Comment entered 2015-06-09 22:48:56 by alan

We may want to do some more programming related to this to cause an error of this type to rollback an entire batch import job and have it re-imported.

In the meantime I suggest that a board manager pick one of the duplicate documents (say 392404) and mark it as rejected with a comment like:

"Duplicate import caused by a possible software error, using EBMS ID 392403 instead. See JIRA issue OCEEBMS-288."

Comment entered 2015-08-13 21:31:14 by alan

I think there is probably a design flaw in the transaction processing error management that could be fixed. My theory is that an unusual, transient, unreproducible, external (network, database, or etc.) error occurred and our error management could have recovered better than it did, without creating a duplicate record. However, the problem only occurred once in hundreds of thousands of imported records and did no significant harm when it did happen. All it did was create a duplicate record. So I'm not sure it's worth the time and testing to try to fix it.

Would it be reasonable to close this issue and mark it "Resolved as Won't Fix"?

Comment entered 2015-08-14 15:17:10 by Juthe, Robin (NIH/NCI) [E]

Yes, I think that's reasonable. We can always reopen the issue if it starts happening more frequently.

Comment entered 2015-08-14 15:22:47 by Kline, Bob (NIH/NCI) [C]

Just out of curiosity, how was the software able to store to rows in the article table with the same primary key?

Comment entered 2015-08-18 10:51:01 by alan

The primary keys (ebms article_id) were different. But two articles were created with the same pubmed ID, which shouldn't have happened. That was the error.

Comment entered 2015-08-31 15:34:36 by trivedim

The same problem has occred today while I imported a single citation PMID: 26304884. The import report showed :
1 UNIQUE ID IN BATCH
26304884 (404792)
0 articles imported
0 articles not listed
1 duplicate article
26304884 (404792)
Actually it imported one citation and it was not a duplicate article.

Attachments

File Name	Posted	User
Ch_HCT_June15.txt	2015-06-03 20:26:31
non-small_cell_June15.txt	2015-06-03 20:26:31

Elapsed: 0:00:00.000739

EBMS Tickets