Issue Number | 4915 |
---|---|
Summary | [Glossary] Consider joining the CDR and Cancer.gov Glossifiers |
Created | 2020-10-22 13:31:17 |
Issue Type | Improvement |
Submitted By | Juthe, Robin (NIH/NCI) [E] |
Assigned To | Kline, Bob (NIH/NCI) [C] |
Status | Open |
Resolved | |
Resolution | |
Path | /home/bkline/backups/jira/ocecdr/issue.277130 |
There are currently two separate glossifiers, one for CDR users and one for Drupal/CMS users. Bob outlined several differences between them (below) in OCECDR-4625. This issue is to examine whether two separate glossifiers are needed and, if so, whether they need to differ as much as they do.
_____
Here are the deltas between the two glossifiers ("CDR" refers to the glossifier used inside XMetaL; "CMS" refers to the glossifier information exported by the CDR for use in the CMS):
Feature | CDR | CMS |
---|---|---|
Includes unpublished terms | ✔ | ❌ |
Normalizes RIGHT SINGLE QUOTATION MARK (U+2019) to APOSTROPHE (U+0027) | ❌ | ✔ |
Strips punctuation | ✔ | ❌ |
Preserves the original term for display in the user interface | ✔ | ❌ |
Uses the | ✔ | ❌ |
Includes dictionary information | ❌ | ✔ |
Parses the publishable document versions | ❌ | ✔ |
Replaces hyphens with spaces | ❌ | ✔ |
Normalizes whitespace | ✔ | ✔ |
Preserves diacritics | ✔ | ✔ |
Collects and stores the glossary information nightly | ❌ | ✔ |
Assembles the glossary information on demand | ✔ | ❌ |
Includes terms without definitions | ✔ | ❌ |
It's possible that I have overlooked some other differences, but I'm pretty confident I've found them all. As far as I know, the requirements for these two glossifiers were established completely independently from each other, so it doesn't seem surprising that there is so much divergence in their behavior. Because of this, it would not seem to make sense, in my judgment, to refactor the common functionality into a single library in the CDR unless the requirements for the two glossifiers were brought more in line with each other.
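To make the string-normalization rows of the table concrete, here is a minimal Python sketch of how the two behaviors might diverge on the same term. It is not taken from either glossifier: the function names `normalize_cdr` and `normalize_cms` are hypothetical, "punctuation" is assumed to mean ASCII punctuation (`string.punctuation`), stripped characters are assumed to be deleted rather than replaced with spaces, and "normalizes whitespace" is assumed to mean collapsing runs of whitespace and trimming the ends.

```python
import string

RIGHT_SINGLE_QUOTE = "\u2019"  # RIGHT SINGLE QUOTATION MARK
APOSTROPHE = "\u0027"          # APOSTROPHE

# Translation table that deletes ASCII punctuation (assumption: the real
# CDR glossifier may use a different punctuation set).
_STRIP_PUNCT = str.maketrans("", "", string.punctuation)


def _collapse_whitespace(text: str) -> str:
    """Collapse runs of whitespace to one space and trim the ends."""
    return " ".join(text.split())


def normalize_cdr(term: str) -> str:
    """Hypothetical CDR-style key: strip punctuation, normalize whitespace.

    Per the table, U+2019 is not mapped to an apostrophe and hyphens are
    not replaced with spaces; diacritics are left untouched.
    """
    return _collapse_whitespace(term.translate(_STRIP_PUNCT))


def normalize_cms(term: str) -> str:
    """Hypothetical CMS-style key: U+2019 to U+0027, hyphens to spaces,
    normalize whitespace. Punctuation and diacritics are left untouched."""
    term = term.replace(RIGHT_SINGLE_QUOTE, APOSTROPHE)
    term = term.replace("-", " ")
    return _collapse_whitespace(term)


if __name__ == "__main__":
    sample = "Burkitt\u2019s  non-Hodgkin lymphoma (BL)"
    print("CDR:", normalize_cdr(sample))
    print("CMS:", normalize_cms(sample))
```

Under these assumptions the same input yields different lookup keys in the two systems, which illustrates why the normalization requirements would need to be reconciled before any shared code could serve both.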