CDR Tickets

Issue Number 3336
Summary [CiteMS] Publishing Problem with CiteMS
Created 2011-04-01 15:39:07
Issue Type Improvement
Submitted By Boggess, Cynthia (NIH/NCI) [C]
Assigned To alan
Status Closed
Resolved 2011-04-15 11:02:57
Resolution Fixed
Path /home/bkline/backups/jira/ocecdr/issue.107664
Description

BZISSUE::5029
BZDATETIME::2011-04-01 15:39:07
BZCREATOR::Cynthia Boggess
BZASSIGNEE::Alan Meyer
BZQACONTACT::Cynthia Boggess

We are experiencing a publishing problem with the CiteMS. Minaxi imported 2 citations (CMSID 231943 & 231944) today which are not appearing on the list of citations ready to be published (http://citems.nci.nih.gov/StaffImport.asp?im_ed=1&im_rc=126). They are both adult treatment citations for the March 2011 review cycle.
I imported the two citations into the test database successfully and they appeared on the list of citations ready to be published on the test site. So the citations are not the problem. My guess is that there is a bug in how a citation is recognized as ready to publish.
Minaxi was able to publish citations successfully on Wednesday March 30th.

Comment entered 2011-04-01 15:53:25 by Osei-Poku, William (NIH/NCI) [C]

BZDATETIME::2011-04-01 15:53:25
BZCOMMENTOR::William Osei-Poku
BZCOMMENT::1

I set the Priority to P4, assigned Cynthia as the QA contact, and added Robin to the cc list.

Comment entered 2011-04-05 22:11:37 by alan

BZDATETIME::2011-04-05 22:11:37
BZCOMMENTOR::Alan Meyer
BZCOMMENT::2

I've been working on this all evening. I haven't found anything
wrong with the application yet, but after some hours of noodling
around I logged onto the actual machine that hosts the databases
and the web-based software. While I was there, a low disk space
warning popped up. It is really low, under 300,000 bytes left.
That's low enough that I'm wondering if the problem is related to it.

I'm going to make an emergency request to the system
administrators to allocate more disk space ASAP!

In the meantime, I recommend that NO activity take place in any
version of the CiteMS.

I'm adding Bob Kline and Volker Englisch to the CC list for this
bug since this problem seems to be popping up in multiple places.
It looks like the software that is supposed to monitor disk space
on our servers is having problems everywhere. I'm also adding
Robin Harrison so she'll know that there are problems going on
with the system.
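
A minimal sketch of the kind of free-space check a monitoring job
could run, assuming Python is available on the server. The function
name, the volume paths, and the threshold are illustrative
assumptions; this is not the monitoring software referred to above.

```python
import shutil


def check_volumes(paths, min_free_bytes):
    """Return (path, free_bytes) for every volume below the threshold.

    `paths` and `min_free_bytes` are hypothetical inputs chosen by the
    caller; nothing here reflects the actual server configuration.
    """
    low = []
    for path in paths:
        # shutil.disk_usage returns a named tuple (total, used, free) in bytes.
        free = shutil.disk_usage(path).free
        if free < min_free_bytes:
            low.append((path, free))
    return low


if __name__ == "__main__":
    # Example: warn when the current volume has less than 1 GiB free.
    for path, free in check_volumes(["."], 1024 ** 3):
        print(f"LOW DISK SPACE on {path}: {free} bytes free")
```

Run on a schedule, a check like this would report the condition
before it got as low as the figure seen here.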

More to come ...

Comment entered 2011-04-05 23:42:28 by alan

BZDATETIME::2011-04-05 23:42:28
BZCOMMENTOR::Alan Meyer
BZCOMMENT::3

I freed up some disk space and posted messages that I hope
weren't too rude to the systems administrators.

I think the first thing we need to do now is for Cynthia to go
back into the CiteMS and see if everything or anything that was
broken now works. I don't really expect it to, but I fear that
there's a lot of work ahead and I don't want to start it without
first checking to see if we could be incredibly lucky. Also test
OCECDR-3337.

Assuming it doesn't work, I'd like Cynthia to document for me
exactly what steps to take to see the problem - I presume that
would be to attempt to accept or reject one of the two indicated
citations. If I need to log on as Cynthia, let me know that.

I'd also like Cynthia or Minaxi to tell me what our deadlines are
on this. How much time do we have before the problem starts
affecting other users as well and, worst of all, impacts the
editorial boards?

What I do after that will depend on how much damage there is,
when our last backup was, and how much work has been done since
the last backup.

If we have a production backup from the night of March 31 / April
1, I would like to try the following:

Restore that last backup on DEV (not production).

Run all of the imports that were run on and after April 1.
There appear to be 131 total citations.

Test to see if everything works.

If it does, then we have some reason to believe that the disk
space problems were the cause of the error. I believe that
most of the software is identical. The coversheet program is
different on dev and not up to date but I can fix that before
the test. It shouldn't affect this test anyway.

I'd also want to test to see if this, combined with the disk
space fix, also fixes OCECDR-3337. If we're lucky we might get
a twofer.

If everything works, I would then want to know how hard it is to
redo any work done between April 1 and today. What I'd be hoping
to do is restore the same backup on production, redo the few
imports, and redo whatever work was done since the failure.

If we don't have a recent backup, and/or if the amount of user
work is too great, then I'll see what I can do to fix things in
the live database. However, since I don't yet know what damage
may have been done in the database, and none of this is
documented anyway, it will be a challenge and could take a long
time.

My first step might be to take a backup of production, restore it
on test, and go to work there, so that if I make a mistake and
break something else it won't do any harm.

I think I need to go out and buy some Maalox.

Comment entered 2011-04-06 09:49:50 by Boggess, Cynthia (NIH/NCI) [C]

BZDATETIME::2011-04-06 09:49:50
BZCOMMENTOR::Cynthia Boggess
BZCOMMENT::4

Minaxi was able to publish the 2 citations that failed to publish when we first reported this bug. It seems having more disk space available has resolved this issue. We got lucky! So save your Maalox for another day 🙂

Comment entered 2011-04-07 09:34:57 by Osei-Poku, William (NIH/NCI) [C]

BZDATETIME::2011-04-07 09:34:57
BZCOMMENTOR::William Osei-Poku
BZCOMMENT::5

Tagged issue appropriately.

Comment entered 2011-04-07 12:57:26 by Boggess, Cynthia (NIH/NCI) [C]

BZDATETIME::2011-04-07 12:57:26
BZCOMMENTOR::Cynthia Boggess
BZCOMMENT::6

We have published about 150 citations and recorded decisions for a number of them with no additional errors.

Comment entered 2011-04-14 23:55:48 by alan

BZDATETIME::2011-04-14 23:55:48
BZCOMMENTOR::Alan Meyer
BZCOMMENT::7

No one has responded to an email from Robin requesting that errors be reported. It appears to have run fine for all users in the last week.

We have taken steps to improve the backup procedures and to prevent backups from filling the disk again. Hopefully this error will not recur.
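
The ticket does not record what those steps were. As one
illustration only, a backup job can keep the newest few backup files
and delete the rest so that old backups cannot fill the disk. The
function names, the directory, the file pattern, and the retention
count below are all hypothetical.

```python
from pathlib import Path


def backups_to_delete(backups, keep):
    """Given (item, mtime) pairs, return the items older than the newest `keep`."""
    if keep < 0:
        raise ValueError("keep must be non-negative")
    # Sort newest first, then drop everything past the retention count.
    newest_first = sorted(backups, key=lambda b: b[1], reverse=True)
    return [item for item, _ in newest_first[keep:]]


def prune_backups(directory, pattern, keep):
    """Delete all but the newest `keep` files matching `pattern` in `directory`."""
    files = [(p, p.stat().st_mtime)
             for p in Path(directory).glob(pattern) if p.is_file()]
    doomed = backups_to_delete(files, keep)
    for path in doomed:
        path.unlink()
    return doomed
```

For example, prune_backups("/backups", "*.bak", 3) would keep the
three most recent .bak files in a hypothetical /backups directory.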

I'm marking it resolved-fixed.

Comment entered 2011-04-15 08:50:59 by Juthe, Robin (NIH/NCI) [E]

BZDATETIME::2011-04-15 08:50:59
BZCOMMENTOR::Robin Juthe
BZCOMMENT::8

I did get a couple replies from other Board managers, just confirming that everything was working as it should. So this issue can probably be closed, but I will let Cynthia do so as the QA. Thanks!

Comment entered 2011-04-15 11:02:57 by Boggess, Cynthia (NIH/NCI) [C]

BZDATETIME::2011-04-15 11:02:57
BZCOMMENTOR::Cynthia Boggess
BZCOMMENT::9

Closing this bug. All issues resolved.
