CDR Tickets

Issue Number 4378
Summary [Gauss QA Testing] Error Message When Checking In Summaries
Created 2018-01-09 12:26:21
Issue Type Bug
Submitted By Juthe, Robin (NIH/NCI) [E]
Assigned To Kline, Bob (NIH/NCI) [C]
Status Closed
Resolved 2018-01-12 09:47:18
Resolution Fixed
Path /home/bkline/backups/jira/ocecdr/issue.219500
Description

I've checked out and checked in multiple summaries and I'm getting an error message each time I try to check in a summary.

The message says "sendCommand: unexpected failure". I hit "OK", and the document remains open on my desktop. Then when I try to save it again, it says the document is not locked, so it does appear to be saving the document despite the failure message.

Comment entered 2018-01-09 14:11:26 by Kline, Bob (NIH/NCI) [C]

Can you give it another shot, please?

Comment entered 2018-01-09 15:00:38 by Juthe, Robin (NIH/NCI) [E]

I'm still getting the error message. It just happened on 2 out of my 3 attempts to save a summary.

Comment entered 2018-01-10 11:21:04 by Kline, Bob (NIH/NCI) [C]

Please try again. If it still fails, please tell me which summaries and when.

Comment entered 2018-01-10 14:57:10 by Osei-Poku, William (NIH/NCI) [C]

At least one user reported getting this error message a few minutes ago when trying to view newly imported citation 791975.
I also received this error message on Friday, but that was when my VPN connection was unstable; as soon as it stabilized, the message went away. I don't think the user was having the same VPN problem I had, since she was able to import the citation without any problem.

Comment entered 2018-01-10 15:26:23 by Osei-Poku, William (NIH/NCI) [C]

Another user just reported getting a sendCommand error while trying to create and save a new glossary term.

Comment entered 2018-01-11 09:02:56 by Juthe, Robin (NIH/NCI) [E]

I'm still running into the sendCommand error. I just received it for the following summaries: 62890, 62863, and 299612. Meanwhile, I didn't get the error when I closed 517309. This all happened within the past few minutes (8:57am-9:02am).

Comment entered 2018-01-11 15:29:30 by Kline, Bob (NIH/NCI) [C]

I was wrong when I said that some of the documents for which you got the error messages did not get saved; it appears that all of the saves succeeded. You just didn't get the response from the server. In fact, the timeline looks odd: it let you fetch and save a second document while the save of the first was still being processed.

08:57:27 SAVE CDR62863 COMMAND RECEIVED
08:??:?? CLIENT GETS ERROR MESSAGE 500
08:58:26 GET CDR62890 COMMAND RECEIVED
08:58:?? CLIENT GETS CDR62890 DOCUMENT
08:58:43 SAVE CDR62890 COMMAND RECEIVED
08:59:17 SAVE CDR62863 COMMAND COMPLETES SUCCESSFULLY (!!!)
08:59:31 SAVE CDR62890 COMMAND COMPLETES SUCCESSFULLY
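
That pattern is what you would expect when the client gives up waiting before the server finishes: the server keeps working and eventually commits the save, but the client has already reported a failure. As a purely illustrative sketch (the actual client is a C++ DLL; the host, port, and wire framing below are placeholders, not the real CDR command protocol), the effect looks roughly like this:

    import socket

    # Illustrative sketch only: the real CDR client is a C++ DLL, and the
    # host, port, and framing here are placeholders rather than the actual
    # CDR command protocol.  The point is just that when the client-side
    # timeout is shorter than the server's processing time, recv() raises
    # a timeout error even though the server goes on to finish the save.

    def send_command(command_xml: bytes,
                     host: str = "cdr-server.example",   # placeholder
                     port: int = 9999,                    # placeholder
                     timeout: float = 60.0) -> bytes:
        # create_connection() applies the timeout to the connect and to
        # every subsequent recv() on the returned socket.
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(command_xml)
            chunks = []
            while True:
                chunk = sock.recv(8192)   # times out if the server is still
                if not chunk:             # busy saving/validating the document
                    break
                chunks.append(chunk)
            return b"".join(chunks)
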
Comment entered 2018-01-12 09:47:18 by Kline, Bob (NIH/NCI) [C]

I have increased the default timeout the client DLL uses to wait for the server's response, and that seems to have gotten rid of the save failures (probably some other failures, too). For saving and validating CDR62863, the entire operation takes about two and a quarter minutes (I have set the timeout to three minutes). The primary activity for this request involves:

  • 79 seconds to collect the link information for 1,775 links (0.045 seconds per link)

  • 45 seconds to validate the links (0.025 seconds per link)

  • 11 seconds to denormalize the links

If I am reading the logs correctly, schema validation processing time is insignificant (I was pleasantly surprised by this). I'll see if there's anything that can be done to speed up the collection of the link information, since that's something that has to happen even when you don't ask for document validation, but 1,775 links is a lot of links. :-)
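
As a quick arithmetic check of those figures (values copied from the list above; 180 seconds is simply the three-minute timeout expressed in seconds):

    # Back-of-the-envelope check of the timings reported above.
    links = 1775
    collect, validate, denormalize = 79, 45, 11         # seconds

    print(round(collect / links, 3))                    # ~0.045 s per link
    print(round(validate / links, 3))                   # ~0.025 s per link
    print(collect + validate + denormalize)             # 135 s, i.e. ~2.25 minutes
    print(collect + validate + denormalize < 180)       # True: fits inside the new timeout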

Comment entered 2018-01-12 11:07:01 by Kline, Bob (NIH/NCI) [C]

At first I thought there might be a way to shave 20 seconds off the link-information collection step by asking the parser to give us only the elements that have attributes used for linking. After further digging, though, I realized that link collection has to examine every node in the document so it can make sure that all of the link target IDs are unique, and there are almost 6,000 elements in this particular document. So the loop is actually examining a node every 0.013 seconds, which is not too shabby. It will probably be faster on production. It might have been faster still in C++ on QA, but given that Python is interpreted rather than compiled, the Python performance is still pretty impressive. Some of these documents are big!
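
For what it's worth, a simplified sketch of that loop (not the actual CDR link module; the namespace and the id/ref attribute names here are invented for illustration) looks something like this:

    import xml.etree.ElementTree as ET

    # Simplified illustration of why link collection must visit every node:
    # guaranteeing that link-target IDs are unique means each element has to
    # be examined, not just the ones carrying linking attributes.  The
    # namespace and attribute names below are placeholders, not the real
    # CDR schema definitions.
    CDR_NS = "urn:example:cdr"

    def collect_links(doc_xml: str):
        root = ET.fromstring(doc_xml)
        targets, links, duplicates = set(), [], []
        for node in root.iter():                        # every element (~6,000 here)
            frag_id = node.get(f"{{{CDR_NS}}}id")
            if frag_id is not None:
                if frag_id in targets:
                    duplicates.append(frag_id)          # duplicate target IDs are an error
                targets.add(frag_id)
            ref = node.get(f"{{{CDR_NS}}}ref")
            if ref is not None:
                links.append((node.tag, ref))           # ~1,775 links to resolve later
        return targets, links, duplicates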

Comment entered 2018-01-12 12:12:19 by Kline, Bob (NIH/NCI) [C]

I have also improved the error message you get when a request times out, and I have added the elapsed time in the alert box for a successful save request.

Comment entered 2018-01-12 15:14:10 by Kline, Bob (NIH/NCI) [C]

Well, how about that! I was able to throw in some low-level alchemy which did succeed in speeding things up measurably. Save times should be significantly lower now.

Comment entered 2018-02-08 11:11:16 by Kline, Bob (NIH/NCI) [C]

Has this been tested since it was fixed?

Comment entered 2018-02-08 11:17:31 by Juthe, Robin (NIH/NCI) [E]

Yes, this was tested and verified.

Comment entered 2018-03-08 09:31:02 by Juthe, Robin (NIH/NCI) [E]

Verified on PROD.
