Issue Number | 313 |
---|---|
Summary | Beef up EBMS Import error management |
Created | 2015-08-20 18:14:46 |
Issue Type | Improvement |
Submitted By | alan |
Assigned To | Kline, Bob (NIH/NCI) [C] |
Status | Closed |
Resolved | 2016-09-09 14:02:02 |
Resolution | Fixed |
Path | /home/bkline/backups/jira/oceebms/issue.168182 |
A number of recent import errors have revealed some limitations in the error management in the EBMS import software. This issue is to improve that error management in light of experience. Some things to do include:
Increase the number of characters saved from an error message.
The current limit of 250 seemed generous at the time, but 500 or so may be needed to give us complete information. We could make it larger still if we alter the ebms_import_batch.messages column, or create a separate table for error messages, to accommodate more.
Save the HTTP error code.
Ensure that I've saved enough info about the request that failed that I can reconstruct it.
Review the transaction management to be sure that, if an entire import is cancelled and not saved in the database, we still save enough to know what happened in the error. This requires a review of our policy on what to do if part of an import fails and part succeeds.
Consider leaving messages elsewhere in addition to the ebms_import_batch.messages column, for example in the Drupal watchdog table.
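To make the first three items concrete, here is a minimal sketch in Python (not the EBMS code itself) of a record that keeps a truncated message, the HTTP status code, and enough of the request to replay it. The class name `ImportFailure`, its fields, and the example URL are all hypothetical; only the 500-character limit comes from the description above.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Optional

# "500 or so" is the limit suggested in this ticket; the old limit was 250.
MAX_MESSAGE_CHARS = 500


@dataclass
class ImportFailure:
    """Hypothetical record of one failed request during an import."""

    message: str
    http_status: Optional[int] = None  # None if the failure was not an HTTP error
    method: str = "GET"
    url: str = ""
    params: dict = field(default_factory=dict)

    def __post_init__(self):
        # Truncate rather than reject, so a long message never blocks the save.
        if len(self.message) > MAX_MESSAGE_CHARS:
            self.message = self.message[:MAX_MESSAGE_CHARS]

    def to_json(self) -> str:
        # One serialized string can go into a messages column or a log line.
        return json.dumps(asdict(self), sort_keys=True)


failure = ImportFailure(
    message="x" * 600,
    http_status=502,
    url="https://example.org/fetch",  # placeholder, not a real endpoint
    params={"id": "1,2"},
)
```

Storing the method, URL, and parameters together is what makes the failed request reconstructible later; the message text alone usually is not enough.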
Done on DEV. Not a user-testable issue.
✔ Increase the number of characters saved from an error message
✔ Save the HTTP error code
✔ Ensure that we've saved enough info about the request that failed that we can reconstruct it
✔ Review the transaction management
✔ Leave messages elsewhere (debug log, watchdog)
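For the last two items, the general pattern looks roughly like the sketch below: roll back the whole import when any part fails, then record the error in a separate transaction and in a log so that it survives the rollback. This is an illustration using Python's `sqlite3` and `logging` modules, not the actual EBMS implementation; the `article` and `import_error` tables, the `run_import` function, and the all-or-nothing policy are assumptions made for the example.

```python
import logging
import sqlite3

log = logging.getLogger("ebms_import_sketch")  # stand-in for the debug log / watchdog


def create_tables(conn):
    # Hypothetical schema, for illustration only.
    conn.execute("CREATE TABLE article (batch_id INTEGER, pmid TEXT, title TEXT)")
    conn.execute("CREATE TABLE import_error (batch_id INTEGER, message TEXT)")


def run_import(conn, batch_id, pmids, fetch):
    """Insert every article or none, and always record what went wrong."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            for pmid in pmids:
                conn.execute(
                    "INSERT INTO article (batch_id, pmid, title) VALUES (?, ?, ?)",
                    (batch_id, pmid, fetch(pmid)),
                )
    except Exception as exc:
        log.error("import batch %s failed: %s", batch_id, exc)
        with conn:  # separate transaction, so this row survives the rollback
            conn.execute(
                "INSERT INTO import_error (batch_id, message) VALUES (?, ?)",
                (batch_id, str(exc)[:500]),
            )
        return False
    return True
```

If the policy were instead to keep the part of an import that succeeded, the per-article insert would need its own transaction; the error-recording step would stay the same.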
I've added ~TrivediM and ~BoggessC as watchers to this ticket. For one thing, while it's true (as I wrote in the previous comment) that the users won't be able to do much (if anything) to verify that I did what Alan created this ticket to do (enhance what we capture when a mysterious import error occurs), the librarians will probably want to do at least a little checking to make sure I haven't broken import horribly. In addition, I wanted to point out that one of the things I'm doing is capturing each of the PubMed files submitted to the software for extracting IDs. I'm not sure how much disk space this will chew up, or whether that could pose a problem, but I wondered if the librarians might consider using the PMID list format, which would take up much less space. I have a sneaking suspicion, though, that the answer might be "no" (they need the information in the PubMed format). Had to ask, though. :-)
~TrivediM and ~BoggessC: have you had an opportunity to think about the question in the previous comment?
Thanks,
Bob
Wasn't able to get a response from the librarians, so I have rewritten the capturing of the PubMed results files so they're stored in the database instead of the file system. This may have an impact on OCEEBMS-392, but probably not enough to worry about. The column in which I'm storing the PubMed responses has a limit of 16-17 megabytes, but I expect all of the files submitted to our import software to be well under that limit.
I imported a large file in Dev and it imported without any error, so nothing is broken.
Regarding your question about using the PMID list format for importing, Cynthia and I do not see any reason to say no. Until now, we have used the Medline format because it was required, and it let us check the details of a citation immediately when needed.
OK, thanks. It's too late to get that into this release, so we'll save it for the next one. It should make the import process more robust, with less parsing to do, and smaller file sizes to deal with.
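To illustrate why the PMID list format means less parsing, here is a rough comparison in Python (not the EBMS code). It assumes that the Medline export marks each record with a line beginning `PMID-` and that the PMID list format is one numeric ID per line. The function names and the sample records are invented for the example.

```python
import re

# Matches a Medline-format tag line such as "PMID- 12345678".
MEDLINE_PMID = re.compile(r"^PMID\s*-\s*(\d+)\s*$")


def pmids_from_medline(text):
    """Scan every line of a Medline-format export for PMID tag lines."""
    ids = []
    for line in text.splitlines():
        match = MEDLINE_PMID.match(line)
        if match:
            ids.append(match.group(1))
    return ids


def pmids_from_list(text):
    """Read a PMID list: one numeric ID per line, blank lines ignored."""
    return [line.strip() for line in text.splitlines() if line.strip().isdigit()]


# Invented sample data, shaped like the two formats.
medline_sample = (
    "PMID- 12345678\n"
    "OWN - NLM\n"
    "TI  - An example title\n"
    "\n"
    "PMID- 23456789\n"
    "OWN - NLM\n"
    "TI  - Another example title\n"
)
list_sample = "12345678\n23456789\n"
```

Both functions yield the same IDs for these samples, but the list input carries nothing except the IDs, which is where the smaller file sizes come from.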