CDR Tickets

Issue Number 4991
Summary Invalid at the top level of the document when logging into the CDR - PROD and QA
Created 2021-06-15 11:38:03
Issue Type Improvement
Submitted By Osei-Poku, William (NIH/NCI) [C]
Assigned To Kline, Bob (NIH/NCI) [C]
Status Closed
Resolved 2021-08-19 10:41:27
Resolution Fixed
Path /home/bkline/backups/jira/ocecdr/issue.292198
Description

This appears to be an isolated case that started this morning and affects only one user. When Isabel Lansberry logs into the CDR on PROD and QA, she gets the attached error message:

"Invalid at the top level of the document when logging into the CDR - PROD and QA",

After clicking OK, she is able to log into the CDR. However, she is not able to use the Ctrl + Enter macro for lists. The lists do not show up; a search dialog box appears instead.

This is happening only on PROD and QA. We tried this on all the tiers and DEV was the only one that appeared to be normal.

Comment entered 2021-06-15 11:55:34 by Kline, Bob (NIH/NCI) [C]

This was working on QA before today?

Comment entered 2021-06-15 12:07:28 by Kline, Bob (NIH/NCI) [C]

This is on QA

Comment entered 2021-06-15 12:11:12 by Kline, Bob (NIH/NCI) [C]

I get the same on PROD.

Comment entered 2021-06-15 12:17:24 by Osei-Poku, William (NIH/NCI) [C]
Comment entered 2021-06-15 12:18:50 by Osei-Poku, William (NIH/NCI) [C]


I get the same on PROD.

Yes, this is happening only to one user (Isabel Lansberry).

Comment entered 2021-06-15 12:20:55 by Osei-Poku, William (NIH/NCI) [C]

We can reinstall XMetaL, but it is strange that this is happening only on QA and PROD and not on DEV.

Comment entered 2021-06-15 12:48:56 by Juthe, Robin (NIH/NCI) [E]

I received the same message just a little while ago when I logged into PROD. I was unable to search for documents. I logged out and logged back in. Upon my second log in, I did not see the error message and everything appears to be working as usual.

Comment entered 2021-06-15 12:50:24 by Kline, Bob (NIH/NCI) [C]

Has she tried the trick of resetting XMetaL? (I don't remember the keystroke combination to be held down at startup, but I bet you do.)

Comment entered 2021-06-15 13:00:50 by Osei-Poku, William (NIH/NCI) [C]
Comment entered 2021-06-15 13:03:53 by Osei-Poku, William (NIH/NCI) [C]

Thanks. She's logged in and out of the CDR several times and even restarted her laptop a few times as well, but the issue still persists. We may try reinstalling XMetaL to see if that resolves the problem.

Comment entered 2021-06-15 13:43:28 by Englisch, Volker (NIH/NCI) [C]

It is SHIFT-CTRL to reset XMetaL.

Comment entered 2021-06-15 14:00:24 by Englisch, Volker (NIH/NCI) [C]

Is this issue limited to GTN documents, or are other document types affected as well?

Comment entered 2021-06-15 14:55:44 by Kline, Bob (NIH/NCI) [C]

Can we set up a screen-sharing session with Isabel on Teams?

Comment entered 2021-06-15 15:32:21 by Osei-Poku, William (NIH/NCI) [C]

It affects all document types. It seems to affect drop down lists and cases where she has to do Ctrl + Enter to bring up a list.

Comment entered 2021-06-16 10:21:58 by Osei-Poku, William (NIH/NCI) [C]

The CDR is working for Isabel this morning. We did not have to reinstall XMetaL, even though there was already a ticket to do so. She turned her laptop off completely overnight, and this morning she was able to access the CDR without the errors she encountered yesterday. Whatever the problem was, it looks like the laptop needed a very good night's sleep and not just a short nap 🙂.

Comment entered 2021-06-16 10:36:39 by Englisch, Volker (NIH/NCI) [C]

Sounds like Isabel was working too hard last week. Laptops have feelings, too. 🙂

Great to hear that Newton isn't to blame for those feelings.

Comment entered 2021-06-16 10:39:20 by Kline, Bob (NIH/NCI) [C]

Pleasant dreams!

Comment entered 2021-06-17 09:53:20 by Kline, Bob (NIH/NCI) [C]

I would have used "Can't reproduce" instead of "Won't fix" as the resolution, as the problem fixed itself. 😉

Comment entered 2021-06-17 10:10:59 by Osei-Poku, William (NIH/NCI) [C]

I have updated the ticket to say "Can't reproduce". I considered that too, but I also thought we were able to reproduce the problem several times. Maybe we should have another option: "The problem fixed itself" 🙂

Comment entered 2021-06-17 15:15:38 by Osei-Poku, William (NIH/NCI) [C]

I have reopened this ticket because another user, Pedro Bringas Casado, our new translator, is experiencing the same issue. He does not see the initial error message upon login, but he does experience missing list items and is unable to use the Ctrl + Enter macro. He restarted both the CDR and his laptop to no avail. He is leaving for the day, so I will check with him again on Monday to see whether the problem is resolved. It is turning out to be a more widespread problem than initially thought, and I have yet to send out the email to all users.

Comment entered 2021-06-17 16:26:16 by Osei-Poku, William (NIH/NCI) [C]

Crystal experienced the problem this morning. She was able to work normally after restarting. Christina also experienced the same problem about an hour ago, and the CDR worked after restarting as well.

Comment entered 2021-06-17 17:23:08 by Kline, Bob (NIH/NCI) [C]

Let me know if you get any more reports today (or Monday) and we'll try another remote diagnostic session.

Comment entered 2021-06-21 09:25:06 by Osei-Poku, William (NIH/NCI) [C]

Sure. Pedro's CDR is still showing signs of the problem this morning. I will send you a Teams invite.

Comment entered 2021-06-21 10:04:24 by Kline, Bob (NIH/NCI) [C]

I've cleared my calendar so I can be available for this whenever you are.

Comment entered 2021-06-21 10:18:43 by Osei-Poku, William (NIH/NCI) [C]

I already sent you an invite in Teams.

Comment entered 2021-06-21 10:31:32 by Kline, Bob (NIH/NCI) [C]

Any chance you sent it to one of the other Bob Klines at NCI?

Comment entered 2021-06-21 10:33:04 by Kline, Bob (NIH/NCI) [C]

Ah, I see it. Your name didn't show up in the left panel.

Comment entered 2021-06-25 09:30:22 by Kline, Bob (NIH/NCI) [C]

I looked at Robin's machine after yesterday afternoon's status meeting, as it was exhibiting some of the symptoms connected with this issue (an error message about an invalid document at startup, an empty document type picklist in the search dialog, and missing valid-values dialogs). The evidence points to a failure of the DLL to parse and extract information from the CdrDocTypes.xml file at startup. One theory was that this file gets replaced when the user switches tiers or when a new DLL is deployed, and that the file system hasn't finished unpacking the file from the zip archive retrieved from the server, leaving the file unreadable. However, the file was clearly present, Robin had not been on a different tier for her previous login, and that login had been after the Newton release. I will try to add some additional debug logging and hope we can reproduce the problem on the lower tiers.
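
The kind of extra debug logging mentioned above could include a shallow look at the raw file contents just before they are handed to the XML parser. The following is only a sketch under assumptions: the function name and the category strings are hypothetical, nothing here is taken from the CDR client DLL, and the check is deliberately not a real XML parse. It only distinguishes an empty file, a file that does not start with markup (for example, raw zip bytes), and a file that appears to be cut off.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical diagnostic; not part of the CDR client DLL.
// Classifies raw file contents so a log line can say why a parser
// might reject them. This is a shallow heuristic, not a validator.
inline std::string classify_xml_text(const std::string& text) {
    const std::string ws = " \t\r\n";
    std::size_t first = text.find_first_not_of(ws);
    if (first == std::string::npos) return "empty";
    // Skip a UTF-8 byte-order mark if one is present.
    if (text.compare(first, 3, "\xEF\xBB\xBF") == 0) {
        first = text.find_first_not_of(ws, first + 3);
        if (first == std::string::npos) return "empty";
    }
    // An XML document must begin with markup.
    if (text[first] != '<') return "no-markup-at-start";
    // A complete document ends with '>' (ignoring trailing whitespace).
    std::size_t last = text.find_last_not_of(ws);
    if (text[last] != '>') return "truncated";
    return "plausible";
}
```

Logging this category together with the file size at startup would help separate "the file was incomplete" from "the file was fine but the parse still failed", which is the distinction the theory above turns on.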

Comment entered 2021-06-25 09:31:42 by Kline, Bob (NIH/NCI) [C]

I have no idea how to estimate the LOE for this ticket, as we don't know enough about what's causing the behavior.

Comment entered 2021-06-25 16:23:43 by Kline, Bob (NIH/NCI) [C]

While investigating ways to capture more information about the causes of these random failures by adding more debug logging, I found a bug which has been around at least since CBIIT took over our hosting back in 2013 (and probably much longer than that) and which is a plausible candidate for the cause of the behavior reported in this ticket. It's the kind of bug which is triggered by the (seemingly) random content of the memory directly adjacent to a buffer the software is populating, so it does not occur every time. Conditions which are unrelated to what the currently running code is doing, and which can be affected by countless factors over which we have no control (for example, what's in the machine's environment variables, including those unrelated to our software), can determine when and how frequently the bug is triggered. I have fixed the bug (and added more debug logging) on CDR DEV and CDR QA.
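
The ticket does not say exactly what the bug was. As an illustration of this class of bug only, the sketch below assumes a fixed-size character buffer that is filled without a terminating NUL and then read as a C string. All names are hypothetical. To keep the example well-defined, the "adjacent memory" is simulated inside a single array rather than by actually reading past the end of a buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// All names are hypothetical; none come from the CDR client source.
constexpr std::size_t kBufSize = 8;   // the buffer being populated
constexpr std::size_t kNeighbor = 8;  // stand-in for adjacent memory

// Bounded stand-in for strlen(): stops at the first NUL or at `limit`.
inline std::string read_until_nul(const char* p, std::size_t limit) {
    const void* nul = std::memchr(p, '\0', limit);
    std::size_t n = nul
        ? static_cast<std::size_t>(static_cast<const char*>(nul) - p)
        : limit;
    return std::string(p, n);
}

// Copies `src` into the buffer and returns what a C-string reader sees.
// `neighbor` models whatever happened to follow the buffer in memory.
inline std::string populate_and_read(const std::string& src,
                                     const std::string& neighbor,
                                     bool terminate) {
    char region[kBufSize + kNeighbor] = {};  // zero-filled
    std::size_t nb = neighbor.size() < kNeighbor ? neighbor.size() : kNeighbor;
    std::memcpy(region + kBufSize, neighbor.data(), nb);

    // Buggy path: a source that fills the buffer leaves no room for NUL.
    // Fixed path: reserve the last byte and always terminate.
    std::size_t room = terminate ? kBufSize - 1 : kBufSize;
    std::size_t n = src.size() < room ? src.size() : room;
    std::memcpy(region, src.data(), n);
    if (terminate) region[n] = '\0';

    return read_until_nul(region, sizeof region);
}
```

With an 8-character source and the unterminated path, the result depends entirely on the neighbor: if the following byte happens to be zero, the read returns the expected 8 characters and the bug stays hidden; if it holds anything else, the read runs on into the neighboring bytes. That is consistent with failures that come and go across restarts and machines without any change to the code.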

Comment entered 2021-06-25 17:07:33 by Kline, Bob (NIH/NCI) [C]

... probably much longer than that ...

Indeed, this bug will be 15 years old tomorrow. 😛

Comment entered 2021-08-19 10:41:20 by Osei-Poku, William (NIH/NCI) [C]

Closing this ticket because no issues have been reported since the hot-fix. Thanks!

Comment entered 2022-04-06 15:47:40 by Kline, Bob (NIH/NCI) [C]

I tried calling you through Teams to follow up on the report of a DOM failure in XMetaL this afternoon, but didn't get a response. Is this the ticket for those failures, or is there another ticket?

Comment entered 2022-04-06 16:46:32 by Osei-Poku, William (NIH/NCI) [C]

I have reopened the other ticket OCECDR-5014

Attachments
File Name Posted User
image-2021-06-15-12-07-21-859.png 2021-06-15 12:07:23 Kline, Bob (NIH/NCI) [C]
MicrosoftTeams-image.png 2021-06-15 11:35:59 Osei-Poku, William (NIH/NCI) [C]
MicrosoftTeams-image (1).png 2021-06-15 11:35:59 Osei-Poku, William (NIH/NCI) [C]
MicrosoftTeams-image (2).png 2021-06-15 11:35:59 Osei-Poku, William (NIH/NCI) [C]
MicrosoftTeams-image (3).png 2021-06-15 11:35:59 Osei-Poku, William (NIH/NCI) [C]
