CDR Tickets

Issue Number 5216
Summary Summary Error - Approved text not showing properly on report
Created 2023-03-09 17:20:31
Issue Type Task
Submitted By Shields, Victoria (NIH/NCI) [E]
Assigned To Englisch, Volker (NIH/NCI) [C]
Status Closed
Resolved 2023-04-25 17:11:05
Resolution Fixed
Path /home/bkline/backups/jira/ocecdr/issue.340621
Description

In CDR811763, the approved markup in the document only shows as bold on the underline/bold report. The document must have an error, but I haven't been able to figure out what it is.

Comment entered 2023-03-10 10:42:00 by Shields, Victoria (NIH/NCI) [E]

I just tried this again and the approved text shows up properly as underline/bold on the report, but when I copy and paste it into Word, it only shows as bold.

Comment entered 2023-03-17 14:13:09 by Englisch, Volker (NIH/NCI) [C]

I want to leave a couple of notes of what I have tried so far to let everyone know and so I won't forget.

I copied the document CDR62675 (Hodgkin Lymphoma Treatment) from PROD to the QA server to test the issue.  On QA, I first had to remove/replace a bunch of citations that don't exist on QA at the moment.  Once the file was edited I was able to run a BU report and recreate the problem:  Copy/pasting the report into Word changes the formatting for text that's bold/underline into bold only.  This happens regardless of which paste option is used in Word:  CTRL-V, Right-click paste (keep source formatting), or Home menu paste option (keep source formatting).

The user is able to paste the document without issues when

  • Only the text (excluding the TOC) is pasted
    One could copy/paste the document in two parts, TOC first, then body of the document before running the Word Macro

  • Portions of the text are pasted

It appears the issue only shows up when the TOC is part of the portions that are pasted into Word.

I have been able to identify a single point in the document from which going forward the pasted content loses the underline formatting.  I can paste text up to that point with the formatting preserved, but once a single additional element is included (here an anchor link for a citation) the underline is lost.  Specifically, the document contains the following small paragraph:

Hypothyroidism. Hypothyroidism is a late complication primarily related to radiation therapy.[66, 67, 68] Long-term survivors who receive radiation therapy to the neck are followed up with annual thyroid-stimulating hormone testing.

 The citations "[66, 67, 68]" are links in the QC report.  When I copy text up to and excluding the link for citation 66, the paste works correctly.  Including citation 66, the underline is lost.

I can also change text before and after this particular point without moving it, as long as the change doesn't alter the number of citations or the TOC.  When those change, the point at which the underline is lost moves as well.

 

... to be continued...

Comment entered 2023-03-23 20:42:53 by Englisch, Volker (NIH/NCI) [C]

I wanted to see if we can reproduce the problem with a summary using the new XMetaL 17, so I was looking for another summary showing the issue.  It seemed that only summaries with a lot of edits - possibly only large summaries - are showing this problem and I'm wondering if this issue is related to some kind of memory limit.  As before, I tried to find the point at which the copy/paste started to drop the "underline" when pasted into Word and I found out that this started to happen around a deeply nested section title.  This section title is displayed as an H5 heading in the HTML.  I changed the H5 into an H4 and immediately the copy/paste started working again for the full QC report.

With this information I went back to the test document I created on QA (using Victoria's document) and changed all newly inserted H5 titles in the HTML BU document into H4s.  This did not solve the problem.  However, once I changed all H5 titles - not just the newly inserted ones - the copy/paste started working again.
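The H5-to-H4 change described above can be sketched as a small script. This is only an illustration of the kind of edit involved, not the procedure actually used in the test; the function name `demote_h5` and the sample HTML are hypothetical, and the regular expression is a naive rewrite that assumes no `<h5` text appears inside comments or scripts.

```python
import re

# Matches the start of opening and closing h5 tags, case-insensitively.
# Attributes following the tag name are left untouched.
H5_TAG = re.compile(r"<(/?)h5\b", re.IGNORECASE)


def demote_h5(html: str) -> str:
    """Return html with every <h5 ...> / </h5> rewritten as <h4 ...> / </h4>."""
    return H5_TAG.sub(r"<\1h4", html)


# Hypothetical fragment of a BU report; real report markup will differ.
sample = '<h4>Late effects</h4><h5 id="_12">Hypothyroidism</h5><p>Text</p>'
print(demote_h5(sample))
```

Saving the rewritten HTML and copying from that file into Word would be one way to repeat the experiment on other summaries.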

So, my current theory is that only summaries with a deeply nested SummarySection structure show this behavior.  These are most likely large documents to begin with.

Next, I will test whether a document must have a lot of edits, which would point to a memory problem, or whether the presence of an H5 title alone is enough to cause the issue. I would also like to find out if I can create a much shorter document which displays the problem.  This should be possible if the existence of an H5 title is enough to cause the problem.

 

... again, to be continued ...

Comment entered 2023-03-23 20:46:42 by Englisch, Volker (NIH/NCI) [C]

While testing I came across an invalid style definition that is corrected now:

  • CDR380958.xml - Module: STYLE QC Summary

Comment entered 2023-03-24 13:17:36 by Englisch, Volker (NIH/NCI) [C]

New interesting finding.  Please see the two screenshots from the same document - one pasted into a blank Word document, the other pasted into an existing Word document (CTRL-A CTRL-V):

 

 

When the BU report of a problematic document is pasted into an existing Word document which was formatted correctly, the pasted content will be formatted correctly as well!

Comment entered 2023-03-24 16:49:36 by Englisch, Volker (NIH/NCI) [C]

"The wonderful thing about standards is that there are so many of them to choose from!"

While trying to get to the bottom of this issue it seemed I was falling deeper and deeper into this rabbit hole.  One reason is Word's insistence on helping you format pasted text, the other is the many different ways one can paste text into Word.

  • One can paste using CTRL-V to paste from the clipboard
    This will display a Paste Options button with additional options: Keep Source Formatting, Keep Text Only, Merge Formatting, ...

  • One can paste by right-clicking the mouse
    Again, similar options as above

  • One can paste using the Paste button on the left of the Home ribbon

As mentioned earlier, these options may change or additional options may be offered depending on whether we're pasting into a new, blank document or overwriting text of an existing Word document.

Then, Word also offers a load of additional paste options like "Keep bullets with Keep Text Only option"; "Use smart cut and paste"; etc.  

In other words, given all these combinations and options I haven't been able to reliably identify what is causing Victoria's document to come out with bolded text only when it should be bold/underline.  Over the course of my tests I wasn't able to observe the behavior with smaller documents, but big documents don't always display the problem either.  I believe the issue is related to some type of memory limitation because any type of link (citation link, heading, TOC link, etc.) seems to play a role.  As I wrote earlier, I was able to eliminate the problem by eliminating H5 elements from the HTML output.  However, creating an H5 element within a smaller document did not trigger the issue, so my conclusion is that the problem only appears in large documents with a significant amount of changes.

That's the current status on finding a solution:  There isn't one yet!

Now, to the good news: I did find a completely new way of "pasting" a QC report into Word and this (new to me) approach displays Victoria's document without any issue!

Follow these steps:

  • Run the QC report as before

  • Now, instead of copying the text content (using CTRL-A CTRL-C) just highlight the URL of the report and copy it

  • Open Word, or if it's already open go to File --> Open

  • Click Browse

  • Paste the URL into the entry field for "File name:"

  • Click Open

If you follow these steps, the document will be loaded read-only and you must save the document first before you can edit it.  Since the document will need to be saved anyway, I hope it's not too much of a burden to first copy it before making any additional edits or running the macros.

If you could try these steps on PROD using your troublemaker summary and let me know the result, that would be great! I will wait to hear from you before I do any additional Word-is-so-much-fun activities. 🙂

Comment entered 2023-04-06 17:22:23 by Englisch, Volker (NIH/NCI) [C]

I was wondering if you had a chance to try out this "new" way of copy/pasting the QC report to Word and - hopefully - confirm it does display the BU QC report correctly.

Comment entered 2023-04-27 17:46:53 by Shields, Victoria (NIH/NCI) [E]

I tested this on the same version of the summary that had the error and it did not work for me. Perhaps we should have a quick screen sharing call so you can see what I'm seeing (and see if I'm doing something wrong)?

FWIW, when I run the bold/underline on the CWD and convert it to Word, it is fine. The error might only be related to this specific version of the summary? Another summary was copied and pasted into a new template to create this summary, and the version I ran the report on had a lot of markup, so there were many places for something weird to be hiding.

Comment entered 2023-06-07 15:52:29 by Shields, Victoria (NIH/NCI) [E]

Another document has this same error. I added approved text to Colon Cancer Treatment (62687) and ran the bold/under report, copied it to a Word doc, and it is only showing as bold.

Comment entered 2023-06-07 16:18:50 by Englisch, Volker (NIH/NCI) [C]

Is this for a specific version or the current working document?

I will try to replicate the issue on my machine.

Comment entered 2023-06-07 16:28:02 by Englisch, Volker (NIH/NCI) [C]

Have you tried to load the document the "new" way as I described?  I can confirm that the CWD of your summary drops the underline when you copy/paste into Word but the underline is preserved when using the "Open file with URL" option.

Comment entered 2023-06-07 16:48:02 by Shields, Victoria (NIH/NCI) [E]

I haven't tried with your new way. I think an online meeting would be helpful to take a look!

Comment entered 2023-06-08 17:13:29 by Englisch, Volker (NIH/NCI) [C]

We met to go over the instructions on how to load a document into Word so that the BU markup is preserved.  For the one document that we looked at the result was correct.  Victoria will test with a couple of other documents to confirm before this approach is accepted as a valid work-around.  At that point the ticket can be closed.

While we were testing, another small issue came up:

If the user clicks on the TOC of the QC report to check out one of the Bold/Underline changes, then the proposed approach will fail (because the parmID value in the URL then includes a fragment ID).  I will enter a ticket for Quinn to fix this minor issue.
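Until that is fixed, the fragment can be removed from the copied URL before it is pasted into Word's "File name:" field. The sketch below shows one way to do that with Python's standard library; the function name `report_url_for_word` and the sample URL (host, script name, and parameter) are hypothetical and stand in for whatever the report's address bar actually shows.

```python
from urllib.parse import urldefrag


def report_url_for_word(url: str) -> str:
    """Drop any '#fragment' so the URL addresses the whole report."""
    return urldefrag(url).url


# Hypothetical URL as it might look after clicking a TOC entry.
clicked = "https://example.org/cgi-bin/QcReport.py?DocId=62687#_section_5"
print(report_url_for_word(clicked))
```

Done by hand, this amounts to deleting everything from the `#` onward before copying the URL.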

Comment entered 2023-07-12 16:27:45 by Shields, Victoria (NIH/NCI) [E]

I tested the cut and paste method on the Colon summary (knowing it had problems, as mentioned in a previous comment) and the approved text still only showed as bold (instead of bold/underline) after I ran the macro, but when I copied and pasted the URL and ran the macro, the text showed as bold/underline. I tested a few other documents and this method worked on those, too. (Although, full disclosure, I wasn't able to reproduce the original problem on those documents. But this procedure seems to work on documents with and without markup issues.)

I think the ticket can be closed.

Comment entered 2023-07-12 17:04:23 by Englisch, Volker (NIH/NCI) [C]

Just a quick note:

Running the macro is unrelated to this issue.  The bold/underline issue is a function of the copy/paste action. The markup is already missing once the text has been pasted and before the macro is run.

I do agree that this ticket can be closed.

Attachments
File Name Posted User
Screenshot 2023-03-24 at 12.55.19.png 2023-03-24 13:06:49 Englisch, Volker (NIH/NCI) [C]
Screenshot 2023-03-24 at 12.55.46.png 2023-03-24 13:07:36 Englisch, Volker (NIH/NCI) [C]
Screenshot 2023-03-24 at 16.26.53.png 2023-03-24 16:53:29 Englisch, Volker (NIH/NCI) [C]
Screenshot 2023-03-24 at 16.29.12.png 2023-03-24 16:54:00 Englisch, Volker (NIH/NCI) [C]
Screenshot 2023-03-24 at 16.30.37.png 2023-03-24 16:54:09 Englisch, Volker (NIH/NCI) [C]
Screenshot 2023-03-24 at 16.32.01.png 2023-03-24 16:54:19 Englisch, Volker (NIH/NCI) [C]
