CDR Tickets

Issue Number 4687
Summary [Summaries] Itemized List Missing Letters After Copy/Paste Into Word
Created 2019-10-30 12:45:22
Issue Type Bug
Submitted By Juthe, Robin (NIH/NCI) [E]
Assigned To Englisch, Volker (NIH/NCI) [C]
Status Closed
Resolved 2021-01-14 13:46:11
Resolution Won't Fix
Path /home/bkline/backups/jira/ocecdr/issue.251821
Description

There is a large itemized list containing nested itemized and bulleted lists in the Adult Soft Tissue Sarcoma TEMP summary (CDR795857 on PROD).

 

The list looks fine in XMetaL and in the HTML QC report, but some of the letters within the itemized list are missing upon copying/pasting it into Word. The attachment includes a screenshot of a portion of the list in both HTML (B/U QC report) and in Word.

Comment entered 2019-10-31 18:08:42 by Englisch, Volker (NIH/NCI) [C]

On my Windows version of Word I see the same thing happening, but the dots (in place of the (a) labels) start at item (6) and continue from there.

On my Mac version I wasn't even able to paste the entire document into Word.  It appears there is a size limit.  When I just copied the portion that contained the long itemized, nested list, the problem did not show up. However, the Mac version of Word numbered all list items with numerals.

I also checked the HTML created by our filters but the html structure looks identical between the different list items.

I would have to agree with Bob.  This looks like a MS-Word bug.

Comment entered 2020-01-09 17:25:08 by Englisch, Volker (NIH/NCI) [C]

I poked around this issue a little more and have been able to copy/paste the nested list into Word while preserving the numbering (this was tested with Word on a Mac).  It appears there are two methods of specifying the numbering for ordered lists - using CSS and using an HTML attribute for the "ol" (ordered list) element.  This HTML type attribute is deprecated in HTML 4 but supported in HTML 5.  I will still need to check if we're running into other issues if we're specifying CSS and the type attribute for the display of the ordered lists.
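As an illustration of the two methods mentioned above, here is a minimal Python sketch that emits an "ol" start tag carrying both a CSS list-style-type rule and the equivalent type attribute. The function name open_ol and the mapping table are hypothetical and are not taken from the CDR filters; only the CSS keywords and the type attribute values are standard HTML/CSS.

```python
# Illustrative only: CSS list-style-type keywords and the equivalent
# values of the type attribute on the HTML ol element.
CSS_TO_TYPE = {
    "decimal": "1",
    "lower-alpha": "a",
    "upper-alpha": "A",
    "lower-roman": "i",
    "upper-roman": "I",
}


def open_ol(css_style):
    """Return an <ol> start tag with both the CSS rule and the type attribute."""
    type_attr = CSS_TO_TYPE[css_style]
    return f'<ol type="{type_attr}" style="list-style-type: {css_style}">'


print(open_ol("lower-alpha"))
```

Specifying both is redundant for browsers, but it gives a consumer that ignores the CSS a second way to learn the intended numbering style.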

Comment entered 2020-01-31 17:44:09 by Englisch, Volker (NIH/NCI) [C]

I discussed the approach with Mark Cramer.  He feels there shouldn't be any problems adding the type attribute to the QC filters.

Comment entered 2020-02-14 13:33:42 by Kline, Bob (NIH/NCI) [C]

I have created a repro case, small enough that it would be suitable for submitting with a bug report against Microsoft Word. The HTML document, which demonstrates the problem without any CSS, can be retrieved at https://rksystems.com/ocecdr-4687-repro.html and the entire page copied into the system clipboard (Windows or Mac) and pasted into a blank Word document to demonstrate the bug. Notice that without any CSS style rules, the alpha "numbering" to override the browser's defaults is not in effect, which actually makes it more obvious what's going on (the bug causes the nested list for the third item to start with 0 instead of 1). Even if we decide not to open a ticket with Microsoft (we might as well, since we're paying for support anyway), I think this makes it clear that there's nothing the CDR (or cancer.gov) is doing to cause the problem, or could do to eliminate it, short of suppressing nested lists altogether. More screen shots attached.
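For anyone who wants to build a similar CSS-free test page, a rough Python sketch follows. It is not the actual repro file: the helper render_list, the outline structure, and the item text are all assumptions made for illustration, and whether a given structure triggers the Word bug would have to be confirmed by pasting the result into Word.

```python
def render_list(items):
    """Render nested items as <ol> markup with no CSS and no attributes.

    Each item is either a string or a (string, sub_items) tuple.
    """
    parts = ["<ol>"]
    for item in items:
        if isinstance(item, tuple):
            text, children = item
            parts.append(f"<li>{text}{render_list(children)}</li>")
        else:
            parts.append(f"<li>{item}</li>")
    parts.append("</ol>")
    return "".join(parts)


# Hypothetical content; the text of the real repro file is not reproduced here.
outline = [
    "First item",
    ("Second item", ["Nested 2.1", "Nested 2.2"]),
    ("Third item", ["Nested 3.1", "Nested 3.2"]),
]
page = f"<!DOCTYPE html><html><body>{render_list(outline)}</body></html>"
print(page)
```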

Comment entered 2020-02-14 13:36:09 by Kline, Bob (NIH/NCI) [C]

One fascinating thing I noticed is that Apple's Finder (the Mac equivalent of Windows Explorer) displays the nested list numbering correctly when it previews the broken Word document. 🙂

Comment entered 2020-02-14 13:37:26 by Englisch, Volker (NIH/NCI) [C]

Wow!

Comment entered 2020-02-14 13:40:29 by Englisch, Volker (NIH/NCI) [C]

I'm actually surprised you're able to replicate the issue with a small list like this.  I thought long lists were the issue, but maybe the level of nesting is what's causing the problem instead, or maybe it has to do with the alternating of list levels.  Very interesting.

I guess I can toss my filter changes and revert back to the original. 🙂

Comment entered 2020-02-14 13:43:59 by Kline, Bob (NIH/NCI) [C]

As pictured here:

Comment entered 2020-02-14 13:51:35 by Kline, Bob (NIH/NCI) [C]

Comment entered 2020-02-24 12:17:55 by Englisch, Volker (NIH/NCI) [C]

I thought you had included a link in this ticket about how to fix the lists after they were pasted into Word, but I can't find it.

Could you please add the link?  I'd like to try it out.

Comment entered 2020-02-24 13:32:57 by Osei-Poku, William (NIH/NCI) [C]

Here is the link to the article. I haven't had the time to try it yet but I will do so later today.

https://support.microsoft.com/en-us/help/275968/a-multilevel-numbered-list-changes-when-you-paste-a-list-into-a-new-do

Comment entered 2020-02-24 13:49:52 by Englisch, Volker (NIH/NCI) [C]

Thanks, William.  I had actually seen this article. The problem reported there indicates that there are indeed issues with the numbering for pasted nested lists; however, it describes a copy/paste causing the pasted lists to continue the numbering sequence of the previous lists.  In our situation, the copy/paste is causing the sub-list to start numbering at zero (0).

Our problem and the problem described here may be related but the workaround is basically:  Go in and manually renumber every list/sub-list.  That is what Robin is already doing and isn't particularly helpful in dealing with the problem.

Comment entered 2020-02-24 14:19:31 by Osei-Poku, William (NIH/NCI) [C]

Thanks for the explanation, Volker! One of the workarounds I found did not require manual renumbering of the list (as suggested by MS in this article). Rather, you have to select the list missing the letters/number, then go to the multilevel list option and select the correct list format. I tried that now and it seems to work. You may want to try that also to see if it works.

Comment entered 2020-02-24 14:36:11 by Englisch, Volker (NIH/NCI) [C]

Yes, that seems to be working well.  You will still have to go through each messed-up list one-by-one to apply the new setting but at least you can repeat the change from the first list to each of the following lists by highlighting the list and hitting CTRL-Y.

I had tried to adjust the missing label for the list-items that were wrong.  This didn't always adjust the following list items.  I guess the trick here is that you must highlight all list-items that belong to the wrongly numbered nested list.

Comment entered 2020-02-27 16:14:29 by Englisch, Volker (NIH/NCI) [C]

Here are the steps I performed to manually fix the lists:

  1. Paste the QC report into Word

  2. Find the WHO classification of soft tissue sarcomas

  3. For me the problem starts with the sub-list of item (6) of the list (Skeletal muscle tumors)

  4. Put the cursor in front of the word "Benign"

  5. Highlight this and all following lines up until (and excluding) item (7 - Vascular tumors)

  6. With the "Home" tab of Word selected find the "Paragraph" group (next to the Font group) and select the drop-down menu of the "Multilevel List" option

  7. Select "Define New List Style..."

  8. In the window that comes up don't make any changes - just click "OK"
    --> The numbering of your list items will be corrected!

  9. Go to the next list item with a numbering issue (7 - Vascular tumors)

  10. Highlight everything between the main items (7) and (8)

  11. Press CTRL-Y (i.e. repeat previous change)

  12. Repeat for each of the mis-numbered list items.

Comment entered 2020-03-06 09:22:41 by Kline, Bob (NIH/NCI) [C]

CBIIT is opening a Microsoft support ticket. They've made me the primary contact, so I'll probably be communicating directly with MS.

Comment entered 2020-03-13 15:20:33 by Kline, Bob (NIH/NCI) [C]

Could one of you (or any user who's running on Windows instead of Mac) bring up the "About Microsoft Word" dialog box and give me the version number? Thanks.

I'm working with the Microsoft specialist assigned to our case.

Comment entered 2020-03-13 15:53:58 by Englisch, Volker (NIH/NCI) [C]

For my Parallels Windows installation I see:
Subscription Product: Microsoft Office 365 ProPlus
Version: 1908 (Build 11929.20606)

That should be close to what Robin and William have.

Comment entered 2020-03-13 16:06:31 by Juthe, Robin (NIH/NCI) [E]

Where can I find "About Microsoft Word"?

Comment entered 2020-03-13 16:13:27 by Kline, Bob (NIH/NCI) [C]

I'm going to guess under Help.

Comment entered 2020-03-13 16:26:50 by Englisch, Volker (NIH/NCI) [C]

No, there isn't an "About Word" anymore.  Go to "Account".  You will see the version under "Product Information".

Comment entered 2020-03-13 16:30:36 by Juthe, Robin (NIH/NCI) [E]

Well, I tried to upload a cropped version of that screenshot but it grabbed the whole thing. Anyway, it looks like I have the same version # as Volker: 1908.

Comment entered 2020-03-25 19:48:34 by Kline, Bob (NIH/NCI) [C]

Just got off the phone with Microsoft support. They have tracked this down to a bug in Word and have determined that they can't fix it. They provided two workarounds, one manual, and one which could be applied in our code. The second would be preferable since it's automated, but has the flaw that while it corrects the numbering it messes up the spacing. Will post details tomorrow morning.

Comment entered 2020-03-26 05:47:55 by Kline, Bob (NIH/NCI) [C]

Here are the two bug workarounds provided by Microsoft.

First workaround: Manually apply the formatting from an unbroken section of the list to the broken section. Here are the steps to apply this workaround to the repro case I created:

  1. Select the pilcrow icon on the Home toolbar to show invisible whitespace

  2. Select the pilcrow (editor's paragraph mark) for the first line in a list which has the formatting we want to copy.

  3. Click the format painter icon ("Copy formatting from one location and apply it to another"), which looks like a paintbrush on the left side of the Home toolbar.

  4. Click the pilcrow mark on the comparable line for the broken section of the list (in this case the line immediately above the incorrectly numbered list item).

  5. Note that the number has been corrected.

The second workaround can be applied programmatically in the filters, which would be modified to add a start="1" attribute to each ol element. For example:

 

   <li>Aliquam ut porttitor leo.
    <ol start="1">
      <li>I should be item number 1 in the nested list.

This workaround triggers a second bug, introducing unwanted extra whitespace immediately before the location of the corrected numbering.

As noted earlier, both workarounds have drawbacks. Turning this ticket back to Volker and the users to decide which workaround to use (and implement, if the second workaround is preferred).
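The second workaround described above could be sketched roughly as follows. This is an illustration in Python, not the actual filter code: the function name add_start_attribute is hypothetical, and the sketch assumes the markup is well-formed XML (for example XHTML) without namespaces.

```python
import xml.etree.ElementTree as ET


def add_start_attribute(xhtml):
    """Return the markup with start="1" set on every ol element.

    Assumes well-formed, namespace-free XML input.
    """
    root = ET.fromstring(xhtml)
    # iter() visits the root itself as well as all of its descendants.
    for ol in root.iter("ol"):
        ol.set("start", "1")
    return ET.tostring(root, encoding="unicode")


sample = "<ol><li>Aliquam ut porttitor leo.<ol><li>Nested item</li></ol></li></ol>"
print(add_start_attribute(sample))
```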

Comment entered 2020-04-01 10:38:11 by Englisch, Volker (NIH/NCI) [C]

I've modified the filter on DEV to set the start="1" attribute and it does fix the issue.  I also haven't seen the extra spacing.  However, I will now have to modify our filters to only set this attribute for inner lists because the top level list will now be renumbered as soon as the problem nested list hits the bug.
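Restricting the attribute to inner lists, as proposed above, might look something like the following Python sketch. The function name add_start_to_nested is hypothetical, the input is assumed to be well-formed, namespace-free XML, and this is not the filter code used on DEV. Later comments in this ticket report that this variant did not resolve the parent renumbering in Word.

```python
import xml.etree.ElementTree as ET


def add_start_to_nested(xhtml):
    """Set start="1" only on ol elements that sit inside another ol."""
    root = ET.fromstring(xhtml)

    def visit(element, inside_ol):
        # Mark the list only when an enclosing ol has already been seen.
        if element.tag == "ol" and inside_ol:
            element.set("start", "1")
        for child in element:
            visit(child, inside_ol or element.tag == "ol")

    visit(root, False)
    return ET.tostring(root, encoding="unicode")


sample = "<ol><li>Outer<ol><li>Inner</li></ol></li></ol>"
print(add_start_to_nested(sample))
```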

Comment entered 2020-04-01 11:11:40 by Englisch, Volker (NIH/NCI) [C]

Jira doesn't like me today.  I tried to attach an image to show the issue but Jira refuses.  Maybe later today.

Comment entered 2020-04-01 15:25:00 by Englisch, Volker (NIH/NCI) [C]

I have good news and bad news on this issue - which one would you like to hear first?  OK, bad news first!

Bad News

The workaround suggested by Microsoft does not work!  It gets us closer but it's not a solution by itself.  As I mentioned above, I modified the filter to add the start="1" attribute MS offered as a workaround, but by doing so the "parent" list started to be renumbered as well, like this:

  1. dada
  2. dada
  3. dada
       a. dudu
       b. lulu
       c. lala
  2. dada    <<<< Parent list renumbered
  3. dada

I was hoping I could fix this problem by only adding the start="1" attribute to the secondary ordered list but that change did not fix the problem.

Good News

There is, however, one extra step we would have to do in order to fix the parent ordered list again.  

  • highlight the entire list

  • right-click and turn off numbering

  • right-click and select numbering again to reset the numbering

This will redraw all labels of the parent list and display the list-items from 1-11.

Comment entered 2020-04-01 15:26:41 by Englisch, Volker (NIH/NCI) [C]

FYI:  This pre-formatting thing in Jira is still a work-in-progress, I hope.  The list did not look like that when I typed it in but I hope you all will still get the picture.

Comment entered 2020-04-01 16:12:12 by Kline, Bob (NIH/NCI) [C]

You have to be careful to notice when you're in visual mode and when you're in text mode.

Comment entered 2020-04-01 16:12:43 by Kline, Bob (NIH/NCI) [C]

Which tier is this on? I'd like to look at the HTML markup.

Comment entered 2020-04-01 16:30:06 by Englisch, Volker (NIH/NCI) [C]

It's on DEV.

Comment entered 2020-04-01 18:01:54 by Kline, Bob (NIH/NCI) [C]

I don't see the originally reported list in the QC report for CDR795857 (the summary given in the original ticket description) on DEV. Do you have a repro case I can look at?

Comment entered 2020-04-01 18:48:54 by Englisch, Volker (NIH/NCI) [C]

You don't see the section "Cellular Classification of Adult Soft Tissue Sarcoma"? 

That's the one I've been using.

Comment entered 2020-04-02 09:39:30 by Kline, Bob (NIH/NCI) [C]

Your markup is as it should be. I have extracted the list to https://rksystems.com/numbered-lists.html and reported the failure to Microsoft on the thread for the original ticket.

Comment entered 2020-04-02 10:26:06 by Englisch, Volker (NIH/NCI) [C]

Jira is in a good mood and granted me permissions to attach the file it rejected yesterday.  Didn't know Jira is playing April Fools jokes. 🙂

Comment entered 2020-04-09 15:09:59 by Juthe, Robin (NIH/NCI) [E]

This issue does seem to be apparent in summaries other than the one mentioned above. Examples include:

 

-Childhood Rhabdomyosarcoma

-Childhood Soft Tissue Sarcoma

Comment entered 2021-01-13 12:45:27 by Englisch, Volker (NIH/NCI) [C]

Should we be moving this ticket to the Parking Lot sprint or close it since we're basically waiting for Microsoft to fix MS-Word?

Comment entered 2021-01-14 13:46:11 by Kline, Bob (NIH/NCI) [C]

Microsoft will not be doing any further investigation into this problem.

Attachments
File Name Posted User
About Word.png 2020-03-13 16:29:10 Juthe, Robin (NIH/NCI) [E]
Bulleted List Problem 10_30_19.docx 2019-10-30 12:44:36 Juthe, Robin (NIH/NCI) [E]
image-2020-03-26-05-24-02-018.png 2020-03-26 05:24:02 Kline, Bob (NIH/NCI) [C]
image-2020-03-26-05-27-30-110.png 2020-03-26 05:27:30 Kline, Bob (NIH/NCI) [C]
image-2020-03-26-05-30-41-623.png 2020-03-26 05:30:42 Kline, Bob (NIH/NCI) [C]
image-2020-03-26-05-34-52-735.png 2020-03-26 05:34:53 Kline, Bob (NIH/NCI) [C]
image-2020-03-26-05-35-38-164.png 2020-03-26 05:35:38 Kline, Bob (NIH/NCI) [C]
image-2020-03-26-05-45-10-020.png 2020-03-26 05:45:10 Kline, Bob (NIH/NCI) [C]
ocecdr-4687-repro.docx 2020-02-14 13:21:25 Kline, Bob (NIH/NCI) [C]
Screen Shot 2019-10-31 at 18.05.55.png 2019-10-31 18:06:24 Englisch, Volker (NIH/NCI) [C]
Screen Shot 2020-02-14 at 1.21.06 PM.png 2020-02-14 13:22:10 Kline, Bob (NIH/NCI) [C]
Screen Shot 2020-02-14 at 1.21.56 PM.png 2020-02-14 13:22:57 Kline, Bob (NIH/NCI) [C]
Screen Shot 2020-02-14 at 1.37.43 PM.png 2020-02-14 13:42:13 Kline, Bob (NIH/NCI) [C]
Screen Shot 2020-04-01 at 10.39.31 AM.png 2020-04-02 10:21:05 Englisch, Volker (NIH/NCI) [C]
