Issue Number | 5143 |
---|---|
Summary | [GovDelivery] Modify GovDelivery reports for English and Spanish to show subsections with changes |
Created | 2022-09-22 12:22:06 |
Issue Type | Improvement |
Submitted By | Juthe, Robin (NIH/NCI) [E] |
Assigned To | Kline, Bob (NIH/NCI) [C] |
Status | Closed |
Resolved | 2022-09-26 06:27:55 |
Resolution | Fixed |
Path | /home/bkline/backups/jira/ocecdr/issue.328165 |
As discussed, we would like to modify the weekly English and Spanish GovDelivery reports of new/changed summaries to add a third column to the New and Revised Health Professional Summaries tables displaying the titles of the subsections that include changes highlighted in the changes section.
To populate this new column (which can be named "Section(s)") with data from the last publishable version of the summary, the software should:
1. Locate the "Changes to summary" section (from the section metadata).
2. Identify text that is in both strong & para tags (our convention is to use strong & para tags to represent headings of sections containing changes. If we used section tags, these would show up as sections in the table of contents. I am sure this is very consistently applied.)
3. Capture this text and display it on the report.
4. If there isn't any text in the Changes to summary section that fits #2 above (strong & para tags), please display the complete string of text in the Changes to summary section.
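The requested steps can be sketched roughly as follows. This is only an illustration using Python's standard-library parser, not the code that actually produces the report. The `Para`, `Strong`, `SummarySection` and `Title` element names come from this ticket; the `SectMetaData/SectionType` path and the helper names are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Assumed metadata value identifying the changes section.
CHANGES_TYPE = "Changes to summary"


def find_changes_section(root):
    """Return the first SummarySection flagged as the changes section, or None."""
    for section in root.iter("SummarySection"):
        # Assumed location of the section-type metadata.
        for node in section.findall("SectMetaData/SectionType"):
            if (node.text or "").strip() == CHANGES_TYPE:
                return section
    return None


def normalize(text):
    """Collapse runs of whitespace into single spaces."""
    return " ".join(text.split())


def changed_sections(root):
    """Return subsection titles (Strong inside Para), else all section text."""
    section = find_changes_section(root)
    if section is None:
        return []
    titles = []
    for para in section.iter("Para"):
        for strong in para.iter("Strong"):
            title = normalize("".join(strong.itertext()))
            if title:
                titles.append(title)
    if titles:
        return titles
    # Fallback (item 4): the complete text of the section. Joining the
    # pieces with spaces is a simplification that can split inline markup.
    pieces = (piece.strip() for piece in section.itertext())
    return [" ".join(piece for piece in pieces if piece)]
```

Calling `changed_sections(ET.parse(path).getroot())` on a summary document would return the list of values for the "Section(s)" column, where `path` is whatever file holds the last publishable version.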
I will attach an example.
If possible, we would like to complete this change by Nov 1.
What would be the purpose of adding a column to show "changes" to a new summary?
This will be fodder for another ticket, but I noticed that the anchor links to specific sections appear to have been broken by a change in the structure of the link targets' IDs.
Here are samples. Still need an answer to my question about why it would be appropriate to add a column showing changes from previous versions for summaries which are completely new.
I noticed that the "Changes to This Summary (MM/DD/YYYY)" title is displayed when there aren't any "Para/Strong" section titles but left out if there are. Not sure if that was part of the requirements.
If there are no Para/Strong elements the requirements call for picking up all of the text content of the changes SummarySection.
I'm not talking about which of the text content is picked up but about the SummarySection/Title text. That's included for the one group but not the other.
If you ask the parser for the text content of a block in an XML document, you get the text content of all of the elements in that block. The Title element is one of the elements in that block, and it has text content, ergo ....
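A small illustration of that point, using Python's standard-library parser (the production code may well use a different parser) and a made-up section with an invented date and paragraph:

```python
import xml.etree.ElementTree as ET

# Made-up sample block; only the element names come from this ticket.
block = ET.fromstring(
    "<SummarySection>"
    "<Title>Changes to This Summary (09/16/2022)</Title>"
    "<Para>Editorial changes were made to this summary.</Para>"
    "</SummarySection>"
)

# itertext() walks every descendant, so the Title text comes along too.
pieces = list(block.itertext())
print(pieces)
```

The first string in `pieces` is the Title text, which is why the title shows up whenever the whole block is dumped.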
Note the word "complete" in item 4 of the requirements. And note that item 4 only applies if item #2 doesn't find anything.
So, this was done on purpose. Ok then! I didn't read the requirements this way because I didn't think Robin is fluent in parser speak. 🙂
I'd be very surprised if she is. My explanation in those terms was for your benefit. She's more likely to say "I want all the text in the section block" (which is what the ticket asks for). Doesn't seem like an unreasonable approach. If her thinking is "I expect that you'll usually find Para/Strong elements (note that she says this markup is used very consistently), but in the rare case where you don't find any, dump all the text from the section block into the cell so we can pick out what we need" then I think that's what we're doing. Of course, she may change her mind and decide she only wants some of the text from the section block in these rare cases, and provide a list of which elements she wants us to use (or exclude, if she prefers to do it that way), in which case we'll modify the software again. Not a big deal, nor hard to do. That's why we test on DEV. But I wouldn't be surprised if she really decided it's safer in the "didn't find what we expected" cases to just get a dump of all the text rather than try to come up with a list which might risk missing some wanted elements. It might even be that the real surprise is that the "didn't find what we were looking for" cases are not as rare as she expected they'd be, and having a dump of everything we did find is the best way to track down why it's happening and fix the problem. 🙂
Good point. We don't need the additional column for new summaries. Thank you!
You are correct, ~volker, I am definitely not fluent in parser speak. 🙂
That being said, I think we could exclude anything in the following elements:
Section Metadata
Title
first Para (this is going to seem fragile, but we have a standard statement in each of the Changes sections that reads "The PDQ cancer information summaries are reviewed regularly and updated as new information becomes available. This section describes the latest changes made to this summary as of the date above.")
An alternative to the "first para" approach could be to look for this matching string and if it appears, remove it. If it doesn't appear, display whatever text shows up within a para element in this section.
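The string-matching alternative could look something like the sketch below. It is an assumption-laden illustration rather than the deployed code: `fallback_text` and `normalize` are hypothetical helper names, and reading only `Para` elements is one way of leaving out the section metadata and the Title.

```python
import xml.etree.ElementTree as ET

# Standard statement quoted in this ticket.
STANDARD_STATEMENT = (
    "The PDQ cancer information summaries are reviewed regularly and updated "
    "as new information becomes available. This section describes the latest "
    "changes made to this summary as of the date above."
)


def normalize(text):
    """Collapse runs of whitespace into single spaces."""
    return " ".join(text.split())


def fallback_text(section):
    """Join the text of the Para elements, skipping the standard statement."""
    paragraphs = []
    # Only Para elements are read, so metadata and Title never appear.
    for para in section.iter("Para"):
        text = normalize("".join(para.itertext()))
        if text and text != STANDARD_STATEMENT:
            paragraphs.append(text)
    return " ".join(paragraphs)
```

Matching on the string instead of on position means a section that lacks the statement, or has it somewhere other than the first paragraph, is still handled; the cost is that any rewording of the statement would let it slip through.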
The samples look great, but I have another request (or two) if possible.
In the GovDelivery email, we are going to be sorting the summaries by Board/topic into the following categories:
Adult Treatment
Pediatric Treatment
Supportive and Palliative Care
Screening and Prevention
Cancer Genetics
Integrative, Alternative, and Complementary Therapies
1. Could we add another column to the New and Revised HP Summaries sections of the report to include the name of the Editorial Board? While not an exact match, I think that's likely to provide the most useful information for Kim to be able to sort the content appropriately.
2. If we could also sort the summaries by Board name, and then alphabetically by summary title, that would be ideal.
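For reference, the requested ordering amounts to a two-part sort key. The rows and field names below are invented for illustration and do not reflect how the report stores its data:

```python
# Invented sample rows; the real report pulls these values from the CDR.
rows = [
    {"board": "Pediatric Treatment", "title": "Wilms Tumor Treatment"},
    {"board": "Adult Treatment", "title": "Melanoma Treatment"},
    {"board": "Adult Treatment", "title": "Breast Cancer Treatment"},
]


def sort_rows(rows):
    """Order by board name first, then by summary title, ignoring case."""
    return sorted(
        rows,
        key=lambda row: (row["board"].casefold(), row["title"].casefold()),
    )


for row in sort_rows(rows):
    print(row["board"], "|", row["title"])
```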
Thank you!
Version 3 (I had just finished version 2 when the latest set of requirements rolled in):
https://cdr-dev.cancer.gov/gd-english-20220930T073450.html
https://cdr-dev.cancer.gov/gd-spanish-20220930T073455.html
I assumed that when you asked that the summaries be sorted by board name and then by title, that you were referring only to the summaries for which we are actually displaying the board name. If that assumption is wrong, let me know (though I think it would look to a report viewer who wasn't in on the details of the requirements as if I had just forgotten to do any sorting at all for the other tables if I'm sorting by values she can't see).
These look great, Bob!
I'm not 100% sure I understand your assumption, but I think you are correct since everything looks as I expect it to 🙂 - I only expected the HP summary tables to have the Board column, if that's what you meant.
Would it be possible to run this report for a broader date range so I can see some new summaries in the table, just to confirm that display is also showing the Board name appropriately? Thank you!
I only expected the HP summary tables to have the Board column, if that's what you meant.
Yes, that's what I meant. I didn't really want to take "also sort the summaries by Board name" too literally.
What date range would you like? I'll have them mailed to you so you can see what the reports will actually look like as email messages.
Maybe we should try the past 6 months or so? 2022-04-01 through 2022-09-30? Thanks!
On their way.
Got 'em and they look great! Thank you.
Hi ~bkline - just wanted to confirm that we're good to go with the switch to using these new reports for English and Spanish this weekend. Thank you!
Thanks, ~juther. Did you see my email message from this morning?
Good morning, Robin. I have swapped in the new version of the GovDelivery report to be run this weekend. Can you confirm whom I should check with to verify that it has what she needs? Is it Kim Reyes?
Commits:
Doesn't need additional testing. This has been working with production data since November 2022.
Verified on PROD. Thanks!
File Name | Posted | User |
---|---|---|
Sample GovDelivery - New Summary Column to Show Sections.docx | 2022-09-22 12:35:08 | Juthe, Robin (NIH/NCI) [E] |