CDR Tickets

Issue Number 4445
Summary [Summaries] Reference numbers for placeholder CitationLink elements
Created 2018-03-28 16:01:47
Issue Type Bug
Submitted By Juthe, Robin (NIH/NCI) [E]
Assigned To Englisch, Volker (NIH/NCI) [C]
Status Closed
Resolved 2018-05-24 13:27:50
Resolution Fixed
Path /home/bkline/backups/jira/ocecdr/issue.223245
Description

We often add empty CitationLink elements as placeholders when we're entering proposed summary changes. The reference list in the summary QC reports used to preserve those references in sequence and have a blank line beside the placeholder reference. For example, if the placeholder CitationLink element was located where the 20th ref would occur, then there would be a blank line beside ref 20 in the reference list, and all other references would be numbered accordingly.

We noticed that the "blank" reference is now showing up as ref 1 in the reference list, which is misleading and typically inaccurate (unless the placeholder reference is actually the first reference in the section). We think this is a new problem possibly introduced in Gauss.

I'll add the CDR ID and version number of a doc on PROD where you can observe this issue.

Comment entered 2018-03-28 16:03:35 by Juthe, Robin (NIH/NCI) [E]

To see an example of this issue, check out V250 of CDR 62911 (Gastric Cancer) on PROD. In the Stage IV/Recurrent section, Immunotherapy section, Pembrolizumab subsection, there are empty brackets in #1.

Comment entered 2018-04-03 12:24:50 by Englisch, Volker (NIH/NCI) [C]

Do I understand this correctly? The problem is not with the citation within the text. The problem is only the sort order of where that blank "citation" appears. It should be listed as citation #22 instead of #1 and all citations up to #22 are now shifted by one, right?

Comment entered 2018-04-03 12:43:39 by Englisch, Volker (NIH/NCI) [C]

By the way, this looks like a typo: Fuchs absract

Comment entered 2018-04-03 13:12:08 by Englisch, Volker (NIH/NCI) [C]

It seems very likely that this change was introduced with Gauss. I found a work-around one could use until the filters have been adjusted - though there was no filter change causing this problem:
When entering an empty CitationLink, go to the @cdr:ref attribute and enter a blank space. This sorts the empty citation properly, but only if there is no more than one empty citation within a section, because our filters remove duplicates.

Comment entered 2018-04-05 13:15:17 by Juthe, Robin (NIH/NCI) [E]

Yes, your description is correct.

Comment entered 2018-04-05 13:16:53 by Juthe, Robin (NIH/NCI) [E]

We often have more than one empty CitationLink, but I'll let Victoria know that this is a potential work-around in the instances where she has just one missing citation. Thanks. Should this go in the release independent queue or does it require a release?

Comment entered 2018-04-05 15:16:07 by Englisch, Volker (NIH/NCI) [C]

The following filter has been updated to fix the citation sorting issue:

  • CDR0000335418.xml: Denormalization Filter: Summary InLine Numbering

Please take a look on DEV.

Comment entered 2018-04-05 15:22:22 by Englisch, Volker (NIH/NCI) [C]

For my information:
Changes are in the local branch cdr4445-citsort.

Comment entered 2018-04-26 13:06:21 by Englisch, Volker (NIH/NCI) [C]

Robin, were you or Victoria able to take a look at these changes to the citation sorting?

Comment entered 2018-04-26 13:21:06 by Juthe, Robin (NIH/NCI) [E]

I have not looked at this yet. Victoria, maybe we can review this together tomorrow?

Comment entered 2018-04-27 16:59:35 by Juthe, Robin (NIH/NCI) [E]

Victoria and I both reviewed this today and it looks pretty good on DEV. Victoria noticed something about the placement of the empty reference in a string of references, and we weren't sure if it had always been that way or if this is a new "feature". 🙂 We can certainly live with it as is, and it's definitely an improvement, but I thought I'd mention it in case it's a simple fix. Victoria summed this all up much better than I could, so I'm going to paste her comments below:

"So it’s better, and certainly we can live with it, but I don’t think it’s working quite like it used to. I used everyone’s favorite test summary, Bladder, for this. On DEV. I added citation links in the Treatment Option Overview and the Stage 0 Treatment section. I added a sentence near each ref so it could be found easier.

Where I added the ref by itself, the report looks good.

Where the ref was added as the last ref in a group of refs, the text is marked in a way that looks like the first ref is the missing ref. But the references list is correct. In the TOO, the first ref I added will be #10 and there’s a blank by #10 in the refs list, but the space in the text is at the beginning of the group, not the end, and looks like #5 is missing.

I’m perfectly happy with this, and it might even have worked this way before."

Comment entered 2018-04-27 18:00:29 by Englisch, Volker (NIH/NCI) [C]

So basically, the sort order in the reference section is now correct but the order in which the empty reference is displayed within a reference range is still broken. I would have to turn in my German passport if I were to let this go through.

It is definitely thorough testing! I'll take a look on Monday.

Comment entered 2018-05-23 11:22:43 by Englisch, Volker (NIH/NCI) [C]

The way the citation numbering is done in the filters is something like this:

  1. Collect all the CitationLink elements in the Reference section

  2. Sort all of the citations

  3. De-dup all of the citations - Now we have the correct sort-order in the Ref-Section.

  4. Go back to the text and assign the citation number assigned in the Ref-Section to the citation within the text. Matching the citations is done by matching the CDR-IDs of the citation.

Since the empty CitationLink elements don't include a CDR-ID yet, we're unable to link multiple empty citations to the appropriate in-line CitationLinks. I have modified the filters to link every empty in-line CitationLink element within a section to the sequence number of the first of (possibly) multiple empty CitationLink elements.
Based on a discussion with and this approach will work because a situation when two empty citations are added to the same section is not common.
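
For illustration only, the four steps above and the handling of empty CitationLink elements can be sketched in Python. This is not the filter code (the real logic lives in the XSLT filters). The function name number_citations, the EMPTY key, and the CDR IDs in the usage example are hypothetical, and the sketch assumes that "sorting" means order of first appearance within the section.

```python
from typing import List, Optional, Tuple

EMPTY = ""  # hypothetical key standing in for a CitationLink without cdr:ref


def number_citations(refs: List[Optional[str]]) -> Tuple[List[str], List[int]]:
    """Return (reference_list, inline_numbers) for one summary section.

    refs holds the cdr:ref value of each CitationLink in document order;
    None or "" marks an empty placeholder.
    """
    # Step 1: collect the keys; every empty placeholder gets the same key.
    keys = [ref or EMPTY for ref in refs]
    # Steps 2-3: keep first-appearance order (assumed sort order) and de-dup.
    reference_list = list(dict.fromkeys(keys))
    # Step 4: assign each in-line link the number of its key in the list.
    # All empty links therefore share the number of the first empty link.
    position = {key: n for n, key in enumerate(reference_list, start=1)}
    inline_numbers = [position[key] for key in keys]
    return reference_list, inline_numbers


# Made-up IDs: two real citations, one repeated, and two empty placeholders.
section = ["CDR0000000101", None, "CDR0000000102", "CDR0000000101", None]
reference_list, inline_numbers = number_citations(section)
print(reference_list)   # ['CDR0000000101', '', 'CDR0000000102']
print(inline_numbers)   # [1, 2, 3, 1, 2]
```

In this sketch the blank entry lands at position 2 of the reference list, where the first placeholder occurs, and the second placeholder reuses that number because nothing distinguishes one empty link from another.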

Comment entered 2018-05-23 19:55:49 by Englisch, Volker (NIH/NCI) [C]

I think I finished all that's needed in order to display the citations with approved (red) and proposed (green) markup.

Comment entered 2018-05-24 13:27:40 by Englisch, Volker (NIH/NCI) [C]

The following filters have been updated:

  • CDR0000339576: Module: InLine Markup Formatter

  • CDR0000335417: Denormalization Filter: Summary Reference Numbering

  • CDR0000335418: Denormalization Filter: Summary InLine Numbering

  • CDR0000335169: Module: STYLE Default

  • CDR0000321373: Revision Markup Filter for QC Report

  • CDR0000315892: Clean up Insertion and Deletion

  • CDR0000000079: Health Professional Summary Report

This is ready for review on DEV.

Comment entered 2018-05-25 16:08:04 by Englisch, Volker (NIH/NCI) [C]

The modified filters have been copied to QA. I'm currently running before/after diff reports.

Comment entered 2018-05-29 15:09:53 by Englisch, Volker (NIH/NCI) [C]

There were no surprises running the diff reports for all document types on QA. All of the changes between the before and after reports were expected.
Robin, please take a look and let me know when you'd like me to copy these changes to PROD.

Comment entered 2018-05-30 17:57:57 by Juthe, Robin (NIH/NCI) [E]

This looks good on QA. You can promote it to PROD. Thanks!

Comment entered 2018-05-31 13:55:06 by Englisch, Volker (NIH/NCI) [C]

The filter changes have been copied to STAGE and PROD.
https://github.com/NCIOCPL/cdr-server/commit/41d02f3

Comment entered 2018-06-07 11:12:42 by Juthe, Robin (NIH/NCI) [E]

Verified on PROD - thanks!
