EBMS Tickets

Issue Number 313
Summary Beef up EBMS Import error management
Created 2015-08-20 18:14:46
Issue Type Improvement
Submitted By alan
Assigned To Kline, Bob (NIH/NCI) [C]
Status Closed
Resolved 2016-09-09 14:02:02
Resolution Fixed
Path /home/bkline/backups/jira/oceebms/issue.168182
Description

A number of recent import errors have revealed some limitations in the error management in the EBMS import software. This issue is to improve that error management in light of experience. Some things to do include:

  • Increase the number of characters saved from an error message.
    The current limit of 250 seemed generous at the time but 500 or so may be needed to give us complete information. We could make it larger still if we alter the ebms_import_batch.messages column, or create a separate table for error messages, to accommodate more.

  • Save the HTTP error code.

  • Ensure that I've saved enough info about the request that failed that I can reconstruct it.

  • Review the transaction management to be sure that, if an entire import is cancelled and not saved in the database, we still save enough to know what happened in the error. This requires a review of our policy on what to do if part of an import fails and part succeeds.

  • Consider leaving messages elsewhere in addition to the ebms_import_batch.messages column, e.g., the Drupal watchdog table.
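
To make the items above concrete, here is a minimal sketch of the idea and not the EBMS code itself. It assumes a hypothetical import_error table kept separate from the imported rows, a hypothetical FetchError exception carrying the request details, and a 500-character message limit (the "500 or so" suggested above). It uses Python with an in-memory SQLite database purely for illustration. The point it shows: when the import is rolled back, the error record (HTTP status, URL, parameters, truncated message) is written and committed on its own, so we still know what happened.

```python
import sqlite3

MAX_MESSAGE_CHARS = 500  # assumed limit; the ticket only says "500 or so"


class FetchError(Exception):
    """Hypothetical exception carrying enough detail to replay the request."""

    def __init__(self, url, params, status, message):
        super().__init__(message)
        self.url = url
        self.params = params
        self.status = status


def make_schema(conn):
    """Create illustrative tables (names are assumptions, not the EBMS schema)."""
    conn.execute("CREATE TABLE article (batch_id INTEGER, pmid TEXT, body TEXT)")
    conn.execute(
        "CREATE TABLE import_error ("
        "batch_id INTEGER, url TEXT, params TEXT, http_status INTEGER, message TEXT)"
    )


def record_failure(conn, batch_id, err):
    """Save the HTTP status, the request details, and a truncated message."""
    conn.execute(
        "INSERT INTO import_error (batch_id, url, params, http_status, message)"
        " VALUES (?, ?, ?, ?, ?)",
        (batch_id, err.url, err.params, err.status, str(err)[:MAX_MESSAGE_CHARS]),
    )


def run_import(conn, batch_id, pmids, fetch):
    """Import every PMID or none; keep the error record either way."""
    try:
        for pmid in pmids:
            body = fetch(pmid)  # may raise FetchError
            conn.execute(
                "INSERT INTO article (batch_id, pmid, body) VALUES (?, ?, ?)",
                (batch_id, pmid, body),
            )
        conn.commit()
        return True
    except FetchError as err:
        conn.rollback()  # discard the partially imported rows
        record_failure(conn, batch_id, err)
        conn.commit()  # the error record is committed separately
        return False
```

Whether a partial import should be kept or discarded is the policy question raised above; this sketch simply takes the all-or-nothing choice.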

Comment entered 2016-09-08 18:22:28 by Kline, Bob (NIH/NCI) [C]

Done on DEV. Not a user-testable issue.

✔ Increase the number of characters saved from an error message
✔ Save the HTTP error code
✔ Ensure that we've saved enough info about the request that failed that we can reconstruct it
✔ Review the transaction management
✔ Leave messages elsewhere (debug log, watchdog)

Comment entered 2016-09-08 18:40:55 by Kline, Bob (NIH/NCI) [C]

I've added and as watchers to this ticket. For one thing, while it's true (as I wrote in the previous comment) that the users won't be able to do much (if anything) to verify that I did what Alan created this ticket to do (enhance what we capture when a mysterious import error occurs), the librarians will probably want to do at least a little checking to make sure I haven't broken import horribly.

In addition, I wanted to point out that one of the things I'm doing is capturing each of the PubMed files submitted to the software for extracting IDs. I'm not sure how much disk space this will chew up, or if that could pose a problem, but I wondered if the librarians might consider using the PMID list format, which would take up much less space. I have a sneaking suspicion, though, that the answer might be "no" (they need the information in the PUBMED format). Had to ask, though. :-)

Comment entered 2016-09-14 08:46:23 by Kline, Bob (NIH/NCI) [C]

and : have you had an opportunity to think about the question in the previous comment?

Thanks,
Bob

Comment entered 2016-09-20 11:11:59 by Kline, Bob (NIH/NCI) [C]

Wasn't able to get a response from the librarians, so I have rewritten the capturing of the Pubmed results files so they're stored in the database instead of the file system. This may have an impact on OCEEBMS-392, but probably not enough to worry about. The column in which I'm storing the Pubmed results responses has a limit of 16-17 megabytes, but I expect all of the files submitted to our import software to be well under that limit.

Comment entered 2016-09-26 18:09:57 by trivedim

I imported a large file in Dev and it imported without any error, so nothing is broken.

Regarding your question to consider using the PMID list format for importing, Cynthia and I do not see any reason to say NO. All this time, we used Medline format because that was required and it helped us to look at the details of the citation instantly if required.

Comment entered 2016-09-27 08:08:46 by Kline, Bob (NIH/NCI) [C]

OK, thanks. It's too late to get that into this release, so we'll save it for the next one. It should make the import process more robust, with less parsing to do, and smaller file sizes to deal with.
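
As a rough illustration of why the PMID list format means less parsing and smaller files, here is a short Python sketch. It is not the EBMS import code: the function names and the sample records are made up, and it assumes only that MEDLINE-format records carry the ID on a line beginning with "PMID-" and that the list format is one numeric ID per line.

```python
import re

# A MEDLINE-format record marks its ID with a "PMID-" tag at the start of a line.
PMID_LINE = re.compile(r"^PMID- *(\d+)\s*$")

# Made-up sample data for illustration only; not real citations.
SAMPLE_MEDLINE = """\
PMID- 10000001
OWN - NLM
TI  - A made-up title used only for illustration.

PMID- 10000002
OWN - NLM
TI  - Another made-up title.
"""

SAMPLE_LIST = "10000001\n10000002\n"


def pmids_from_medline(text):
    """Pick the IDs out of MEDLINE-format text, ignoring every other field."""
    ids = []
    for line in text.splitlines():
        match = PMID_LINE.match(line)
        if match:
            ids.append(match.group(1))
    return ids


def pmids_from_list(text):
    """Read a plain list with one numeric PMID per line."""
    return [line.strip() for line in text.splitlines() if line.strip().isdigit()]
```

Both functions return the same IDs for these samples, but the list input is a fraction of the size, which is the space saving discussed in the comments above.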
