CDR Tickets

Issue Number 3057
Summary [CiteMs] Citations import error
Created 2010-01-05 14:12:06
Issue Type Improvement
Submitted By Osei-Poku, William (NIH/NCI) [C]
Assigned To priced
Status Closed
Resolved 2010-01-06 17:01:44
Resolution Fixed
Path /home/bkline/backups/jira/ocecdr/issue.107385
Description

BZISSUE::4733
BZDATETIME::2010-01-05 14:12:06
BZCREATOR::William Osei-Poku
BZASSIGNEE::Brent Carter
BZQACONTACT::William Osei-Poku

Problem as reported by Cynthia:

I tried to import a citation this morning and got an error message something like…pmid not found. And the following errors made it seems as though the citation was not imported but in fact a record was created but it only contained the sumtop, pmid and cmsid. All the bibliographic data was blank. Minaxi mentioned recent problems with importing citations to the CDR due to recent NLM end of the year changes. I bet this is also the reason the CMS is having problems importing citations as well.
I am attaching the file I used. The record created is cmsid=196891, pmid=20038717
We will need this fixed soon because we are due to start importing citations for the Jan 2010 review cycle early next week.

Note:
I have attached all the relevant emails. I will be adding the current email from Brent, Cynthia and Minaxi in the next comment.

Comment entered 2010-01-05 14:12:06 by Osei-Poku, William (NIH/NCI) [C]

Attachment RE CMS citation Importing problem.txt has been added with description: Email exchanges

Comment entered 2010-01-05 14:14:01 by Osei-Poku, William (NIH/NCI) [C]

BZDATETIME::2010-01-05 14:14:01
BZCOMMENTOR::William Osei-Poku
BZCOMMENT::1

-Email from Brent-

Hi All,

Pete got back to me and he was pretty confident that the plan we had to change the "S" to "M" is the correct action to prevent the error we're seeing, and that there shouldn't be any additional work that needs to be done (i.e. my worry about data structures or program logic that may need updating). It is great to have this confirmation from someone with experience with the system. If other NLM changes cause issues we may have to revisit this question.

He has also helped with our second issue regarding how to remove the partially imported records. Luckily, he was able to get in touch with another Lockheed person who also worked on CiteMS and she provided a re-usable script that can safely delete a single or range of records. This script has one required step and then one additional step if the citation has already gone through review. I assume the second step is not needed in this case but can someone confirm the citation(s) we want to remove have not yet gone through review?

Below are the next steps I think we should take and the sample delete script. We should store this script somewhere for future reference. Would it make sense to create a Bugzilla bug for this issue and put it in there?

Thanks,
Brent.

Comment entered 2010-01-05 14:15:16 by Osei-Poku, William (NIH/NCI) [C]

BZDATETIME::2010-01-05 14:15:16
BZCOMMENTOR::William Osei-Poku
BZCOMMENT::2

Emails from Cynthia and Minaxi-

Correction: CMSID 196891-196903 are to be deleted.

---Original Message---
From: Trivedi, Minaxi
Sent: Tuesday, January 05, 2010 1:30 PM
To: Carter, Brent (NIH/NCI) [C]; Grama, Lakshmi (NIH/NCI) [E]; Boggess, Cynthia; Osei-Poku, William
Cc: Kline, Robert (NCI); 'Alan Meyer'; Safarnejad, Reza (NIH/NCI) [C]
Subject: RE: CMS citation Importing problem

Hi All,

CDR ID 196891 to 196903 are to be deleted. These citations have not yet gone through the review.

Thanks,
Minaxi

-----

Comment entered 2010-01-05 15:11:10 by Kline, Bob (NIH/NCI) [C]

BZDATETIME::2010-01-05 15:11:10
BZCOMMENTOR::Bob Kline
BZCOMMENT::3

Here are two PMIDs for records which had the CommentsCorrections structure changes in the XML version of the documents:

19620164
19689789

It would be good to have the users import at least one of these to make sure that the CiteMS system wasn't affected.

Comment entered 2010-01-05 16:54:00 by Osei-Poku, William (NIH/NCI) [C]

BZDATETIME::2010-01-05 16:54:00
BZCOMMENTOR::William Osei-Poku
BZCOMMENT::4

(In reply to comment #3)
> Here are two PMIDs for records which had the CommentsCorrections structure
> changes in the XML version of the documents:
>
> 19620164
> 19689789
>
> It would be good to have the users import at least one of these to make sure
> that the CiteMS system wasn't affected.

From Cynthia:

As I expected, when I imported this citation the same pmid not found error occurs because of the 2 IS fields . The IS field issue needs to be fixed before we can test the comments correction block. The import filter reads line by line from the top of the citation record to the bottom. It is getting hung up on the multiple IS fields and is not reading the rest of the record.
We will need to ask that this citation that I just imported be deleted as well and then we can use it after the IS issue is fixed to test this again.

This Cmsid=196904 needs to be deleted.

The second citation is already imported into the CMS.

Comment entered 2010-01-05 16:59:19 by Kline, Bob (NIH/NCI) [C]

BZDATETIME::2010-01-05 16:59:19
BZCOMMENTOR::Bob Kline
BZCOMMENT::5

(In reply to comment #4)
> (In reply to comment #3)
> > Here are two PMIDs for records which had the CommentsCorrections structure
> > changes in the XML version of the documents:
> >
> > 19620164
> > 19689789
> >
> > It would be good to have the users import at least one of these to make sure
> > that the CiteMS system wasn't affected.
>
> From Cynthia:
>
> As I expected, when I imported this citation the same pmid not found error
> occurs because of the 2 IS fields . The IS field issue needs to be fixed before
> we can test the comments correction block. The import filter reads line by line
> from the top of the citation record to the bottom. It is getting hung up on the
> multiple IS fields and is not reading the rest of the record.
> We will need to ask that this citation that I just imported be deleted as well
> and then we can use it after the IS issue is fixed to test this again.
>
> This Cmsid=196904 needs to be deleted.
>
> The second citation is already imported into the CMS.

Right. I should have said that I was hoping she'd do the testing in the test system. Please make sure she tests again on the system where Brent has addressed the IS field issue.

Comment entered 2010-01-06 08:11:06 by priced

BZDATETIME::2010-01-06 08:11:06
BZCOMMENTOR::Brent Carter
BZCOMMENT::6

Previous developers on the CiteMS system confirmed there is no way to delete citations from within the software itself and provided the following script that can be used to delete citations (of course the ID values need to change):

delete from asc_ORIGINAL_SUMMARIES where ref_id between 183223 and 183229

delete from asc_CITE_SUMMARIES where ref_id between 183223 and 183229

delete from bib where ref_id between 183223 and 183229

Extra steps if the citation already go through review:

delete from asc_SEARCH_TOPICS where review_id in(select review_id from mt_review where ref_id in(169300,166208,164997,121949,177813,177919,160484))

delete from mt_review where ref_id in(169300,166208,164997,121949,177813,177919,160484)

Comment entered 2010-01-06 09:14:02 by Kline, Bob (NIH/NCI) [C]

BZDATETIME::2010-01-06 09:14:02
BZCOMMENTOR::Bob Kline
BZCOMMENT::7

Brent:

Any idea how the IDs in the last two queries are derived? We might want to figure out (perhaps in consultation with the original developers) if it's possible to take a little bit more of the manual intervention out of the deletion process by plugging in the IDs from the first queries into a sub-sub-select. And maybe we could even wrap the whole script up into a stored procedure. I understand you didn't need to mess with the final two queries this time, since the citations we're dealing with right now hadn't made it to the review stage, but if I understand correctly what the queries are doing, and if it makes sense to add the sub-sub-selects to the last two queries, those two should be harmless no-ops if no review has taken place for the citation being deleted. Doesn't have to happen as part of this task (I realize you're jamming on other work), but I just wanted to see if you think this idea is worth pursuing at some point.

Comment entered 2010-01-06 09:48:33 by priced

BZDATETIME::2010-01-06 09:48:33
BZCOMMENTOR::Brent Carter
BZCOMMENT::8

I didn't spend much time looking into it but my suspicion is that this is all just based on a list of input ref_id values. I'm betting the entire example script they gave us was never actually run as is. I think maybe they found more than one old script. The first part of the script was found from a time when that was all they had to do. Then, they remembered the optional second step and copied and pasted that part from some other old script that had a different set of ref_id inputs. Looking at some of the DDL used to create the DB seems to confirm this theory since there is a direct relationship from the tables in the second step to the "bib" table in the first based on ref_id:

ALTER TABLE [dbo].[mt_REVIEW] WITH NOCHECK ADD
CONSTRAINT [FK_mt_REVIEWref_i_00AA174D] FOREIGN KEY([ref_id])
REFERENCES [dbo].[bib] ([Ref_ID])

ALTER TABLE [dbo].[asc_SEARCH_TOPICS] WITH NOCHECK ADD
CONSTRAINT [FK_asc_SEARCrevie_12C8C788] FOREIGN KEY([review_id])
REFERENCES [dbo].[mt_REVIEW] ([review_id])

Given this, I'd think it would be pretty easy to wrap up in a stored procedure as you suggest. The only challenge might be making it general (e.g. it is nice to allow a range input which covers deleting a single or range of values). But we might also need to delete a set of random values (as they do in the second step by using an "in"). We wouldn't want to use "in" all the time since if we have a large range that would be ugly to input all the individual values... but I'm sure something could be done.

As for if I think it is a good idea, absolutely yes! Especially if we expect this to happen more in the future and we don't expect to re-write the app any time soon. Depending on those factors we may even consider adding a delete function to the software itself (based on the procedure) so admin users can do this themselves. Like you said, the more manual intervention we can remove the better.

Comment entered 2010-01-06 10:29:58 by priced

BZDATETIME::2010-01-06 10:29:58
BZCOMMENTOR::Brent Carter
BZCOMMENT::9

Confirmed with Pete, the previous developer. From him:

I’m sure that the statements were just pieced together. The ref_id (or set of ref_ids) will be the same for all statements.

Comment entered 2010-01-06 11:48:38 by priced

BZDATETIME::2010-01-06 11:48:38
BZCOMMENTOR::Brent Carter
BZCOMMENT::10

Here is the actual script run on prod CiteMS:

use NCI_CMS
go

    • Fix the issue where IS can now occur multiple times.
      update mt_ImportDef set TagType = 'M' where ID = 551

    • Remove the bad citation records.
      declare @FirstRefId int
      declare @LastRefId int

set @FirstRefId = 196891
set @LastRefId = 196904

begin transaction
delete from asc_ORIGINAL_SUMMARIES where ref_id between @FirstRefId and @LastRefId
delete from asc_CITE_SUMMARIES where ref_id between @FirstRefId and @LastRefId
delete from bib where ref_id between @FirstRefId and @LastRefId
commit transaction

/*

Output:

(1 row(s) affected)

(15 row(s) affected)

(15 row(s) affected)

(14 row(s) affected)

*/

Comment entered 2010-01-06 16:59:55 by Osei-Poku, William (NIH/NCI) [C]

BZDATETIME::2010-01-06 16:59:55
BZCOMMENTOR::William Osei-Poku
BZCOMMENT::11

Added additional email exchanges.

Comment entered 2010-01-06 16:59:55 by Osei-Poku, William (NIH/NCI) [C]

Attachment RE CMS citation Importing problem1.txt has been added with description: Email exchanges1

Comment entered 2010-01-06 17:00:44 by Osei-Poku, William (NIH/NCI) [C]

BZDATETIME::2010-01-06 17:00:44
BZCOMMENTOR::William Osei-Poku
BZCOMMENT::12

Marked issue as "RESOLVED" in preparation for closing.

Comment entered 2010-01-06 17:01:44 by Osei-Poku, William (NIH/NCI) [C]

BZDATETIME::2010-01-06 17:01:44
BZCOMMENTOR::William Osei-Poku
BZCOMMENT::13

(In reply to comment #12)
> Marked issue as "RESOLVED" in preparation for closing.

This issue has been resolved. It is now closed. Thank you!

Attachments
File Name Posted User
RE CMS citation Importing problem.txt 2010-01-05 14:12:06 Osei-Poku, William (NIH/NCI) [C]
RE CMS citation Importing problem1.txt 2010-01-06 16:59:55 Osei-Poku, William (NIH/NCI) [C]

Elapsed: 0:00:00.001723