Issue Number | 5105 |
---|---|
Summary | Possible report to generate counts of various PDQ content |
Created | 2022-03-18 12:45:58 |
Issue Type | Improvement |
Submitted By | Juthe, Robin (NIH/NCI) [E] |
Assigned To | Kline, Bob (NIH/NCI) [C] |
Status | Closed |
Resolved | 2023-03-14 15:41:05 |
Resolution | Fixed |
Path | /home/bkline/backups/jira/ocecdr/issue.313461 |
I often pull together counts of published PDQ content of various types (see attached PPT slide as an example). We have various lists reports already, and the number of DoCT terms is on Cancer.gov, but it would be helpful to have a single report that could be generated so I don't have to look in a bunch of different places. Let's discuss.
Robin will add specific criteria for any rows on the report which aren't obvious (particularly the media counts). Bob will then come up with an ad-hoc report and we'll take it from there.
I think we discussed that this would likely be an ad-hoc report, but here are the numbers I'm interested in getting and where I think they could come from:
PDQ English HP summaries
Summaries Lists report - please use Summaries & Modules selection
PDQ Spanish HP summaries
Summaries Lists report - please use Summaries & Modules selection
PDQ English patient summaries
Summaries Lists report - please use Summaries & Modules selection
PDQ Spanish patient summaries
Summaries Lists report - please use Summaries & Modules selection
PDQ English SVPC summaries
Summaries Lists report
PDQ Spanish SVPC summaries
Summaries Lists report
Drug information summaries (only available in English)
Drug Info Summaries Lists report - please combine total of single agent summaries & combination summaries
NCI Dictionary of Cancer Terms in English
I currently get this from Cancer.gov - I'd like the number of published glossary terms rather than the number of definitions
NCI Dictionary of Cancer Terms in Spanish
I currently get this from Cancer.gov - I'd like the number of published glossary terms rather than the number of definitions
NCI Dictionary of Genetics Terms in English
I currently get this from Cancer.gov - I'd like the number of published glossary terms rather than the number of definitions
NCI Dictionary of Genetics Terms in Spanish
I currently get this from Cancer.gov - I'd like the number of published glossary terms rather than the number of definitions
NCI Drug Dictionary Terms (only available in English)
I currently get this from Cancer.gov by manually counting the number of terms for each letter. I don't think we have an existing report to generate this statistic but I think we could get this number by identifying publishable Term documents that have a definition block and a semantic type of drug/agent.
Biomedical Images and Animations
Media Lists report - this should combine the number of images + videos (unless we start adding a bunch of other types of videos, this seems to make the most sense - we have just 3 animations) BUT I'd like to exclude images that we're reusing from journals or something like that. To do that I think you could exclude any images that have a Permission Information block (see CDR788250 as an example)
I'm sure there will be questions, so just let me know as those come up. Thank you!!
~juther do you want the counts restricted to documents which have actually been published?
Edit: answering my own question: I guess not, since you've asked for modules to be included, and they won't show up in the pub_proc_cg table.
Second edit: I guess it's not as simple as that, as I see that for the glossary terms you want the published documents.
~juther Are we going back to using "PDQ" when referring to the SVPC summaries, then?
Report attached.
I already love this report. 🙂 It will be a big time saver for me.
In response to your first question, I'd like this to reflect publishable documents for all categories. I see your point about summaries and modules, but if possible I'd like to include only the publishable modules (those that are published as both a module and a standalone summary – in other words, they have a "yes" value for the AvailableAsModule attribute but no value for the ModuleOnly attribute).
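The rule above can be sketched roughly as follows. This is only an illustration, not the report's actual query: it assumes the AvailableAsModule and ModuleOnly attributes sit on the root element of the summary document, and the sample documents and the `summary_kind` helper are made up for the example.

```python
import xml.etree.ElementTree as ET

def summary_kind(xml_text):
    """Classify a summary document by its module attributes (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    # Assumption: both attributes live on the document's root element.
    if root.get("ModuleOnly"):
        return "module-only"          # cannot be published on its own: not counted
    if (root.get("AvailableAsModule") or "").strip().lower() == "yes":
        return "summary-and-module"   # standalone summary that is also a module
    return "summary"                  # ordinary standalone summary

# Made-up sample documents, one per case.
samples = [
    '<Summary/>',
    '<Summary AvailableAsModule="Yes"/>',
    '<Summary AvailableAsModule="Yes" ModuleOnly="Yes"/>',
]
kinds = [summary_kind(doc) for doc in samples]
countable = sum(kind != "module-only" for kind in kinds)
print(kinds, countable)
```

Under that reading, only the document carrying a ModuleOnly value drops out of the count.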
In response to your second question, yes, that was a mistake to include "PDQ" at the beginning of the SVPC summary items. Please remove PDQ from those rows. Thanks!
One additional request: for biomedical images and animations, is it possible to break that down by language?
A while ago I tried to explain that we were inviting confusion by our use of the unqualified word "module" to refer to different things at different times. I don't think I explained the problem very well, as the response I got was mostly along the lines of "we always know what we mean." This ticket provides a pretty good example of the confusion I was ineptly trying to warn about. For the Summaries Lists report, I was explicitly told that for the context of that report the word "module" referred to a document which could not be published separately. So when your original requirements for this ticket asked for using the Summaries and Modules selection logic of the Summaries Lists report, I created queries for this new report which also include documents which can only be used as modules, just as the Summaries Lists report does. However from your most recent comment, I can see that this was not what you really wanted, and you don't want the "modules" (as defined for that older report). I'll rewrite the queries. 😉
As for splitting the media by language, can I assume that I should use the presence of a TranslationOf element to indicate Spanish, and the absence of that element to mean English? (The @language attributes sprinkled around in those documents are kind of a muddle, and therefore unreliable, as they contradict each other; see, for example, the discussion in OCECDR-5095.) I'll wait for your answer to this question before proceeding with making the requested changes.
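For concreteness, the proposed test could look something like the sketch below. It is only an illustration of the assumption in question (TranslationOf present means Spanish, absent means English); the sample documents, the MediaTitle element used in them, and the `media_language` helper are invented for the example, and the @language attributes are deliberately ignored.

```python
import xml.etree.ElementTree as ET
from collections import Counter

def media_language(xml_text):
    """Guess a Media document's language from TranslationOf (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    # Assumption: any TranslationOf descendant marks the document as Spanish.
    has_translation_of = root.find(".//TranslationOf") is not None
    return "Spanish" if has_translation_of else "English"

# Made-up sample documents; real Media documents carry much more content.
samples = [
    "<Media><MediaTitle>Lung anatomy</MediaTitle></Media>",
    "<Media><MediaTitle>Anatomia del pulmon</MediaTitle><TranslationOf/></Media>",
    "<Media><MediaTitle>Breast anatomy</MediaTitle></Media>",
]
counts = Counter(media_language(doc) for doc in samples)
print(dict(counts))
```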
If the Media documents really are all language-specific, as William says (and as your request implies), I have to wonder why we don't have a required Language element or attribute at the top level of the document, as we do for Summary documents. 😛
One more clarification. The original requirements asked that the report include only "published" glossary terms, but your most recent comment says "I'd like this to reflect publishable documents for all categories." I just want to confirm that you want me to remove the restriction to glossary documents which show up in the pub_proc_cg table (that is, they're actually available to the web site and the data partners) and only make sure that a publishable version exists and the document isn't blocked. Won't make a huge difference most of the time, but there can be a gap.
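To illustrate the kind of gap I mean, here is a toy example using an in-memory SQLite database. Only the pub_proc_cg table name comes from the real system; the `docs` and `versions` tables, their columns, and the rows are simplified stand-ins invented for this sketch, not the actual CDR schema or the report's queries.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical, simplified tables for illustration only.
CREATE TABLE docs (id INTEGER PRIMARY KEY, blocked INTEGER NOT NULL);
CREATE TABLE versions (doc_id INTEGER, num INTEGER, publishable INTEGER);
CREATE TABLE pub_proc_cg (id INTEGER PRIMARY KEY);
INSERT INTO docs VALUES (1, 0), (2, 0), (3, 1), (4, 0);
INSERT INTO versions VALUES (1, 1, 1), (2, 1, 0), (2, 2, 1), (3, 1, 1), (4, 1, 0);
INSERT INTO pub_proc_cg VALUES (1);
""")

# "Published": the document is actually in pub_proc_cg.
published = conn.execute("SELECT COUNT(*) FROM pub_proc_cg").fetchone()[0]

# "Publishable": a publishable version exists and the document isn't blocked.
publishable = conn.execute("""
    SELECT COUNT(DISTINCT d.id)
      FROM docs d
      JOIN versions v ON v.doc_id = d.id
     WHERE v.publishable = 1
       AND d.blocked = 0
""").fetchone()[0]

print(published, publishable, publishable - published)
```

In this toy data, document 2 has a publishable version but has not yet gone out in a publishing job, so it is counted under the second definition and not the first.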
Let's discuss these questions/clarifications in our status meeting shortly.
We discussed adding this report to the menus. Please add it to the OCC Board Managers page, under PCIB Management Reports. It could be #3: PDQ Content Counts.
Thank you!
All the enhancements have been implemented, and the report is installed (on DEV, though the data comes from PROD) on the admin menu as requested. I've got menu entries for both HTML and Excel. If you'd prefer to have just one format on the menu, let me know which one and I'll remove the other.
Fine to have both options. Looks great on DEV!
The HTML version of this report looks good on QA, but I'm getting a "Failed - Disk Full" message when I try to run the Excel version. Is it just me?
I just tried it without any problems. And looking at the QA server, all the disks seem to have plenty of free space. Want to do a screen share?
Turned out to be a problem with disk space on the virtual machine. Once that was resolved, I was able to verify this on QA. Thanks!
Verified on PROD. Thanks!
File Name | Posted | User |
---|---|---|
PDQ Content Counts.xlsx | 2023-03-14 15:39:42 | Kline, Bob (NIH/NCI) [C] |
What is PDQ Now - Statistics.pptx | 2022-03-18 12:45:55 | Juthe, Robin (NIH/NCI) [E] |