Issue Number | 3833 |
---|---|
Summary | [Summaries] Module section titles do not display in Summaries TOC report |
Created | 2014-11-18 11:02:34 |
Issue Type | Bug |
Submitted By | Juthe, Robin (NIH/NCI) [E] |
Assigned To | alan |
Status | Closed |
Resolved | 2014-11-25 14:19:04 |
Resolution | Fixed |
Path | /home/bkline/backups/jira/ocecdr/issue.141910 |
Section titles within a summary module document do not display in the summaries TOC report for the summaries that contain the module section(s). However, the section titles do appear in the TOC on Cancer.gov.
Example:
Peutz-Jeghers syndrome [summary module] - CDR738176
included in
Genetics of Colorectal Cancer [summary] - CDR62863
This report was created before we started using modules. The report takes a summary and scans through the content to find Title elements for SummarySection without resolving any links. We will have to filter the summary document and include the content of the module prior to scanning the XML for the SummarySection titles or mark the position of the SummaryModuleLink in the TOC since the TOC for the module itself does get created.
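To illustrate the behavior described above, here is a minimal sketch of a title scan that does not resolve links. It is not the report's actual code. The sample document, the `ref` attribute name, the "Major Genetic Syndromes" title, and the `collect_titles` helper are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified summary; real CDR documents are far richer.
SUMMARY_XML = """
<Summary>
  <SummarySection>
    <Title>Introduction</Title>
    <SummarySection>
      <Title>Background</Title>
    </SummarySection>
  </SummarySection>
  <SummarySection>
    <Title>Major Genetic Syndromes</Title>
    <SummaryModuleLink ref="CDR738176"/>
  </SummarySection>
</Summary>
"""

def collect_titles(node, depth=0):
    """Return (depth, title) pairs for every SummarySection, in document order."""
    entries = []
    for child in node:
        if child.tag == "SummarySection":
            title = child.findtext("Title", default="").strip()
            entries.append((depth + 1, title))
            entries.extend(collect_titles(child, depth + 1))
        else:
            # Other elements are searched at the same nesting level.
            # A SummaryModuleLink has no children, so it contributes nothing.
            entries.extend(collect_titles(child, depth))
    return entries

toc = collect_titles(ET.fromstring(SUMMARY_XML))
for level, title in toc:
    print("  " * (level - 1) + title)
```

Because the link is never followed, the module's own section titles are missing from `toc`, which matches the symptom reported in this issue.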
The reason why the titles are listed on Cancer.gov is that Cancer.gov doesn't get the content of the module as a separate document but included (denormalized) in the summary document itself. Cancer.gov doesn't even know that these two documents were once separated.
Assigning this to Alan per our discussion.
I have a quick question: There are options to display the TOC for all summaries (of a specific type) or for a single document. Are both of these options used? I would assume that running the report for a single document is a rare event, but I could be wrong.
Both options are used. We actually use the TOC for a single summary more often than we use the option for all summaries. We check this report quite a bit when revising/reorganizing a summary to make sure all of the sections are nested appropriately.
I installed a new version of the program in DEV and version control with two changes.
First, it denormalizes SummaryModuleLinks, as requested. The display of modules is done inline with no indication that the data came from a module. It's as if it were part of the parent document. If that isn't satisfactory, we'll have to make some more changes.
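A rough sketch of what inline denormalization can look like, under stated assumptions: the `MODULES` lookup stands in for fetching the module document from the repository, the `ref` attribute name and the sample titles are hypothetical, and the `denormalize` function is illustrative rather than the installed code.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for loading a module document by its CDR ID.
MODULES = {
    "CDR738176": """
<Summary>
  <SummarySection>
    <Title>Peutz-Jeghers Syndrome</Title>
  </SummarySection>
</Summary>
""",
}

def denormalize(parent):
    """Replace each SummaryModuleLink under parent with the module's sections, in place."""
    new_children = []
    for child in list(parent):
        if child.tag == "SummaryModuleLink":
            module = ET.fromstring(MODULES[child.get("ref")])
            denormalize(module)  # a module could itself link to another module
            sections = module.findall("SummarySection")
            if sections:
                sections[-1].tail = child.tail  # keep the text that followed the link
            new_children.extend(sections)
        else:
            denormalize(child)
            new_children.append(child)
    parent[:] = new_children

summary = ET.fromstring(
    "<Summary><SummarySection><Title>Major Genetic Syndromes</Title>"
    '<SummaryModuleLink ref="CDR738176"/></SummarySection></Summary>'
)
denormalize(summary)
titles = [title.text for title in summary.iter("Title")]
print(titles)
```

After the call, the module's sections sit where the link was, with no marker of their origin, so any later title scan treats them as part of the parent document. The sketch does not guard against modules that link to each other in a cycle.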
The second change has to do with the way insertion and deletion markup is handled. I haven't gone through the history of the program to find out why this happened, but two different methods were used to resolve insertion and deletion markup: a relatively simple method and a more complicated method. In most cases the two produce the same results. However, if there are nested insertions or deletions, the more complicated method was sensitive to that and could, I think, produce a more accurate result.
I changed the program to use the more complicated method everywhere.
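For readers unfamiliar with why nesting matters, the sketch below shows one nesting-aware way to resolve such markup with a recursive tree walk. It is an assumption-laden illustration, not either of the two methods in the program: it assumes elements named `Insertion` and `Deletion`, accepts every insertion, applies every deletion, and ignores any revision-level attributes.

```python
import xml.etree.ElementTree as ET

def _append_text(parent, text):
    """Attach text after the last child of parent, or to parent.text if it has none."""
    if not text:
        return
    if len(parent):
        last = parent[-1]
        last.tail = (last.tail or "") + text
    else:
        parent.text = (parent.text or "") + text

def _copy_children(source, target):
    """Copy source's children into target, resolving markup along the way."""
    for child in source:
        if child.tag == "Deletion":
            pass  # drop the element and everything nested inside it
        elif child.tag == "Insertion":
            _append_text(target, child.text)
            _copy_children(child, target)  # nested markup is resolved recursively
        else:
            target.append(resolve(child))
        _append_text(target, child.tail)  # text after the element always survives

def resolve(elem):
    """Return a copy of elem with Deletion content removed and Insertion wrappers unwrapped."""
    result = ET.Element(elem.tag, dict(elem.attrib))
    result.text = elem.text
    _copy_children(elem, result)
    return result

para = ET.fromstring(
    "<Para>The <Deletion>old <Insertion>very </Insertion>text</Deletion>"
    "<Insertion>new <Deletion>draft </Deletion>text</Insertion> stays.</Para>"
)
output = ET.tostring(resolve(para), encoding="unicode")
print(output)
```

In the sample, the insertion nested inside the deletion disappears along with it, while the deletion nested inside the insertion removes only its own content.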
There is now more processing involved than before, both to denormalize the Summaries and to use the more complicated method for resolving insertion/deletion markup. I don't think the time penalty will be any problem at all for single summaries (as per Volker's question above) but might be noticeable on big, multi-summary reports.
It's ready for testing on DEV.
I just did a timing check on all English, Health Professional, Adult Treatment Summaries.
The old method used around 29 seconds. The new method used around 47 seconds.
I can think of ways to optimize this and get closer to the 29 seconds, but not easily and it would be a lot of work. I'm thinking that it's not worth adding the programming effort and software complication.
Verified on DEV. Thank you!
I don't think the timing is that bad. We often run the report for a single summary; I ran it for the behemoth Genetics of Breast and Ovarian Cancer and it came up within a few seconds. All Adult Tx summaries (with all levels) came in at 44 seconds.
I'm marking it as resolved fixed and I'll tag it as awaiting release.
Verified on QA.
Verified on stage.
Verified on prod.