Issue Number | 4780 |
---|---|
Summary | [Glossary] Health Professional Glossary Terms Report |
Created | 2020-02-13 09:35:15 |
Issue Type | Improvement |
Submitted By | Juthe, Robin (NIH/NCI) [E] |
Assigned To | Kline, Bob (NIH/NCI) [C] |
Status | Closed |
Resolved | 2020-05-07 17:27:05 |
Resolution | Fixed |
Path | /home/bkline/backups/jira/ocecdr/issue.256679 |
We would like to create a new report to view health professional dictionary terms with or without their definitions. I'm attaching specifications for the new report.
Dictionary: Genetics OR None
Just to confirm: None means only display terms which have no dictionary, right?
That's right. No dictionary. Thanks.
If "could look like the GTC by type report" from the specification means we need to implement rich text mixed formatting in the definition cells AND we need to support an Excel version of the report (unlike the GTC by Type report) then this will be at least a 20. Can you confirm?
What do you mean by rich text? Filling in the placeholders? Please clarify.
I'm fine with the report only being in HTML for now and we could add Excel down the road if it's needed. You make a good point that we don't have this for the GTC by Type report so I don't think it's necessary. Not sure how much that saves in terms of LOE.
What do you mean by rich text?
I mean more than one font style (color, size, weight, font family, etc.) applied individually to parts of a cell.
That's not supported by the common framework we have built for Excel reports (because it's not supported by the underlying openpyxl library), so we'd have to pull in another, more specialized Excel library and build the report by hand. We can do it (two other reports were implemented that way), but it's definitely more work.
I'm adding some more general notes for the development team on Excel support in the CDR, as questions about this topic have come up in the past, and I don't always remember the details myself.
Home-grown libraries for reading and writing Excel workbooks have been completely retired. Those libraries were developed back in the days when there were no suitable third-party libraries available.
We still have some scripts which use the older xlrd and xlwt libraries.
The xlwt library is only able to generate spreadsheet files compatible with Microsoft Excel versions 95 to 2003 (Excel 97/2000/XP/2003 XLS), and cannot create modern Excel workbooks. The xlrd package is a companion to xlwt and is able to read both .xls and the newer .xlsx files.
The openpyxl package is currently our primary tool for reading and writing modern Excel files. Its two limitations are (a) it does not support old Excel .xls files; and (b) it does not yet support rich text in cells (though there is a pull request in the pipeline for such support). Because of the first limitation, we should keep the xlrd package installed, even after all use of it in our existing scripts has been replaced with openpyxl. The openpyxl package is well supported, and it is currently the most widely-used package for working with modern Excel files. Our own report framework provides a wrapper around it to facilitate creating Excel versions of the reports, and all new software (and extensive rewrites of existing scripts and libraries) should use this package (with our report framework, if feasible).
The xlsxwriter package does not support reading Excel workbooks, but it does supply the rich text support which openpyxl currently lacks. There are currently two reports (Media Caption and Content Report and Summary Standard Wording) in the CDR which use this package to meet the requirement for rich text in Excel report cells.
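As a quick reference, the notes above can be condensed into a small lookup helper. This is only a sketch of the guidance as written in this comment: the function name pick_excel_library and its parameters are hypothetical and are not part of the CDR code base or of any of the packages mentioned.

```python
from pathlib import Path


def pick_excel_library(filename, mode, rich_text=False):
    """Return the package name the notes above suggest for a task.

    Hypothetical helper for illustration only; it is not CDR code.
    mode is "read" or "write"; rich_text only matters when writing.
    """
    if mode not in ("read", "write"):
        raise ValueError(f"unsupported mode: {mode!r}")
    ext = Path(filename).suffix.lower()
    if ext == ".xls":
        # Legacy workbooks: openpyxl cannot handle them at all.
        return "xlrd" if mode == "read" else "xlwt"
    if ext == ".xlsx":
        # xlsxwriter cannot read, so it is only chosen for writing
        # when mixed formatting inside a single cell is required.
        if mode == "write" and rich_text:
            return "xlsxwriter"
        return "openpyxl"
    raise ValueError(f"not an Excel workbook: {filename!r}")


choices = [
    pick_excel_library("legacy.xls", "read"),
    pick_excel_library("report.xlsx", "write"),
    pick_excel_library("report.xlsx", "write", rich_text=True),
]
```

The default for modern files stays openpyxl (ideally through our report framework); xlsxwriter is the exception for rich text cells only.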
~juther: need some more clarification. I assume "terms" in the phrase "blocked terms" (in the requirements attachment) refers to the GlossaryTermName documents and not the GlossaryTermConcept documents. I base this assumption on two things:
In the GTC By Type report it's the GTN documents which are labeled BLOCKED rather than the GTC document.
The word "terms" is used elsewhere (without any qualification) to refer to the term names, not the term concepts (e.g., "List of Terms").
(It's ironic, given that dictionaries are all about eliminating ambiguities in the meaning of words, that the words we use to identify things for the glossary documents are packed with so much confusing overloading, don't you think? 😛)
So here's my question: if the user chooses to exclude blocked terms and a concept has some blocked and some unblocked name documents, I would guess that for the Concept flavor of the report we'll show the concept but only show the unblocked names. What if all of the concept's name documents are blocked? Do we skip the concept altogether for the Concept version of the report (as we will for the List of Terms flavor)? Or do we show the concept but use a blank cell for the second column?
Hmm. I posted a longish comment to this ticket late in the day yesterday, to give you an FYI that in anticipation of creating this new report based on the GTC By Type report, I had reworked the latter report so I wouldn't be replicating the old techniques of manually assembling strings for the report's HTML into the new report. I even had screen shots to show how much faster the rewritten GTC By Type report was, and oddly enough, the screen shots I had pasted into the comment appear to have survived, but Jira lost the comment itself. So here's a less long-winded version of that comment. 🙂
While I'm soliciting clarification of the requirements: I see that TermType is unbounded for the GTC docs. So if the user says exclude LOE terms, and a GTC has a term type of Level of evidence as well as one or more other term types, we still exclude the concept, right?
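To make the question concrete, here is a minimal sketch of the "any match excludes the concept" reading. It assumes the answer to the question above is yes, which this ticket does not explicitly record. The dictionary layout, the sample values, and the function name are hypothetical and are not taken from the report's actual code.

```python
def exclude_concepts_by_type(concepts, excluded_type):
    """Drop every concept which has excluded_type among its term types.

    Because TermType is unbounded, a concept is excluded as soon as
    any one of its types matches, even if it has other types too.
    """
    return [c for c in concepts if excluded_type not in c["term_types"]]


# Hypothetical sample data standing in for GlossaryTermConcept documents.
concepts = [
    {"id": 1, "term_types": ["Level of evidence"]},
    {"id": 2, "term_types": ["Level of evidence", "Other"]},
    {"id": 3, "term_types": ["Other"]},
]

kept_ids = [c["id"] for c in exclude_concepts_by_type(concepts, "Level of evidence")]
```

Under this reading, concept 2 is dropped along with concept 1 even though it also carries another term type.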
Installed on the Glossary Terms Reports menu page on DEV. Please report any bugs you find here on this ticket. Please submit any modifications to the original requirements as new tickets in Maxwell.
Looks good on DEV. (Amy reviewed this on DEV and said it worked as expected).
Yes, by terms I meant term names. 🙂 I also like how you've handled the display of blocked terms - it appears you are still displaying blocked term names on the concept version of the report (even if all terms for the concept are blocked), including the blocked (or not used anyway) definition. It looks good to me as you have it. Thanks.
This looks good to me too although I noticed one small thing:
At the top of the Genetics Dictionary version of the report, it says "Glossary Dictionary". I think you meant to say "Genetics Dictionary". Thanks!
Indeed. Fixed. (All those G words!)
Verified on DEV. Thanks!
Verified on QA. Thanks!
Verified on PROD.
File Name | Posted | User |
---|---|---|
Health Professional Glossary Terms Report.docx | 2020-02-13 09:36:34 | Juthe, Robin (NIH/NCI) [E] |
image-2020-04-29-08-10-04-271.png | 2020-04-29 08:10:04 | Kline, Bob (NIH/NCI) [C] |
image-2020-04-30-18-45-18-133.png | 2020-04-30 18:45:18 | Kline, Bob (NIH/NCI) [C] |
image-2020-04-30-18-45-53-018.png | 2020-04-30 18:45:53 | Kline, Bob (NIH/NCI) [C] |