Issue Number | 307 |
---|---|
Summary | Error while importing to the EBMS |
Created | 2015-08-04 16:11:42 |
Issue Type | Bug |
Submitted By | trivedim |
Assigned To | alan |
Status | Closed |
Resolved | 2015-08-04 19:52:55 |
Resolution | Won't Fix |
Path | /home/bkline/backups/jira/oceebms/issue.166738 |
While importing citations for the August 2015 review cycle, I got the
attached error twice. The first time was when I had just
begun importing.
I was able to import one file, Adrenocortical Carcinoma, with 7 citations
(one was a duplicate). While importing the next file, Adult ALL, I got the
attached message. I logged out and logged in again and could import
about 32 files. Now I have gotten the error again while trying to import
the Head and Neck Cancer file. Please let me know if it is safe to import
after logging off and logging in again.
Alan:
Please take a look at this.
I'll look at it tonight and post something about what I found or didn't find.
It appears that there were three failures today. As near as I can tell, all of them were errors originating at NLM. There were two 503 "Service Unavailable" errors and one 502 "Bad Gateway" error. I presume these were caused by transient problems at NLM with their webserver, database, network, or some other component.
At our end, each of these errors occurred at the start of an import. No articles were imported and nothing went wrong in our database. No partial or corrupt article records were stored. Even if we had imported some articles and failed on others, everything should still be okay in our database. The articles imported successfully would be stored. If a mangled article came in (very unlikely; I think these transactions either succeed or fail), it would almost certainly be discarded because of an XML parse error. If the search were re-run, the articles that were successfully imported in the interrupted job would come in again but simply be marked as duplicates, with no harm done, and any that were missed in the first import would come in with the second.
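To illustrate why re-running an interrupted import should be harmless, here is a minimal sketch of the behavior described above. It is not the EBMS code: the function `import_articles`, the dict used as a store, and the simplified `<Article><PMID>` layout are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET


def import_articles(store, batch):
    """Import XML article strings into store (a dict keyed by PMID).

    Hypothetical sketch: records that fail to parse are discarded,
    and records whose PMID is already stored are reported as duplicates.
    """
    report = {"imported": [], "duplicate": [], "discarded": 0}
    for raw in batch:
        try:
            article = ET.fromstring(raw)
        except ET.ParseError:
            # A mangled record never reaches the store.
            report["discarded"] += 1
            continue
        pmid = article.findtext("PMID")
        if pmid is None:
            report["discarded"] += 1
        elif pmid in store:
            # Already imported by an earlier (possibly interrupted) run.
            report["duplicate"].append(pmid)
        else:
            store[pmid] = raw
            report["imported"].append(pmid)
    return report
```

Running the same batch a second time only adds the records that were missed the first time; everything already stored is reported as a duplicate.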
Therefore, I think it's perfectly safe to go on importing. If this happens again I would suggest waiting a few minutes to give NLM a chance to recognize and fix any problems at their end, and then try again. If it keeps happening, I'd wait longer than a few minutes. But whatever we do, I don't believe these kinds of errors will cause any problems other than inconvenience at our end.
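The "wait a few minutes, then wait longer" advice could also be automated. The sketch below is only an illustration under stated assumptions: `fetch_with_retry`, `TransientError`, and the `fetch` callable (which returns a status code and a body) are hypothetical names, not part of the EBMS or of any NLM client.

```python
import time

# The two statuses reported in this issue.
TRANSIENT_STATUSES = {502, 503}


class TransientError(Exception):
    """Raised when the remote service keeps returning a transient status."""


def fetch_with_retry(fetch, attempts=4, first_wait=60.0, sleep=time.sleep):
    """Call fetch() until it returns a non-transient status.

    fetch is any callable returning (status_code, body). The wait
    doubles after each failed attempt, so a persistent problem is
    retried less and less often.
    """
    wait = first_wait
    last_status = None
    for attempt in range(1, attempts + 1):
        last_status, body = fetch()
        if last_status not in TRANSIENT_STATUSES:
            return last_status, body
        if attempt < attempts:
            sleep(wait)
            wait *= 2
    raise TransientError(
        f"still failing with {last_status} after {attempts} attempts"
    )
```

The `sleep` parameter is there so that the waiting can be replaced with a stub when trying this out; in real use the default `time.sleep` applies.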
I'm resolving this as "Won't Fix" because I think the fixing needs to be done at NLM, not at our end. They're pretty reliable there. I'm optimistic that they've already fixed their problem and that we won't see the problems tomorrow.
I just got this error too, when trying to import a single citation. I received the error 4-5 times, logged out and logged back in, and it worked. Just wanted to document that it's still happening, but it's sporadic and seems to be resolving itself relatively quickly.
Oddly, I only see one error in the log file for production from today, at 11:39 am. Were all of the attempts made in PROD?
As before, it was a "Service unavailable" error coming from NLM. In theory, your logout and login shouldn't have had any effect on this; it was probably just coincidence that NLM resolved the problem shortly before your re-login.
I'll do a little noodling around at NLM to see if they provide any tracking info for when they've had problems.
Yes, all attempts were on PROD in relatively quick succession (over the next few minutes) from the same page. I guess it's good to know that hitting "SUBMIT" again after a failure doesn't act as a new submission.
I've sent a message to the Pubmed Help desk asking if they had an outage at 11:39:47 today. Hopefully they have a record they can use to confirm or deny that the problem was at their end.
I think Pubmed has an automatic failover to a backup site, just as we do for cancer.gov. I've also asked them about that and asked how long their failover takes.
I'll post the answers if I get any.
As I think I mentioned in the status meeting, I did get an answer. The Help desk person was not aware of any outages. She asked for more information about it and I sent what we have.
Our interaction with Pubmed is through a front end "eutilities" webservice that, in turn, talks to the Pubmed database. It's possible that the error originated at NLM, but in the eutilities service rather than in the Pubmed search and retrieval system itself.
It will be reassuring if NLM is able to find that there was indeed an error today. However, I'm going to create another issue for improving the error management in the EBMS. The scope of it is a little larger than would seem to fit in just addressing these particular errors - which I still think are just transient errors from NLM.
File Name | Posted | User |
---|---|---|
EBMS error Aug 4, 2015.docx | 2015-08-04 16:11:42 | |
screenshot-1.png | 2015-08-20 11:43:24 | Juthe, Robin (NIH/NCI) [E] |