CDR Tickets

Issue Number 5343
Summary [Storefront] Create report to identify which Drupal content needs to be updated in Storefront
Created 2024-11-07 14:31:19
Issue Type Improvement
Submitted By Juthe, Robin (NIH/NCI) [E]
Assigned To Kline, Bob (NIH/NCI) [C]
Status Resolved
Resolved 2024-12-24 07:48:26
Resolution Fixed
Path /home/bkline/backups/jira/ocecdr/issue.475204

We would like to catch up our content in Storefront: remove pages that we have since taken down and add pages that we have since added (such as the modernized patient-focused content).

Comment entered 2024-11-25 10:46:09 by Kline, Bob (NIH/NCI) [C]

There are 741 nodes in Drupal with the syndication flag turned on. My understanding is that even though that field is only stored with the English version of the node, it applies to all languages of each node, which total 1,348 pages marked for syndication. The HHS Storefront has 1,204 items with NCI as the source. I have attached an Excel workbook with three tables.

  • Missing From HHS Storefront (Drupal has it, HHS doesn't have it)—305 URLs

  • Not On Drupal (HHS has it, Drupal doesn't—or it is no longer flagged for syndication)—161 URLs

  • Dead URLs On Storefront (HHS has a link we gave them which no longer retrieves our content)—154 rows
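For reference, the first two tables amount to set differences between the two URL lists. The following is only a minimal sketch of that comparison, not the script that produced the workbook; the function name, the dictionary keys, and the example.gov URLs are all made up for illustration.

```python
def compare_url_sets(drupal_urls, storefront_urls):
    """Return the URLs unique to each side (hypothetical helper)."""
    drupal = set(drupal_urls)
    storefront = set(storefront_urls)
    return {
        # Drupal has it flagged for syndication, HHS doesn't have it.
        "missing_from_storefront": sorted(drupal - storefront),
        # HHS has it, Drupal doesn't (or it is no longer flagged).
        "not_on_drupal": sorted(storefront - drupal),
    }


# Illustrative inputs only; these are not real report data.
drupal_urls = ["https://example.gov/a", "https://example.gov/b"]
storefront_urls = ["https://example.gov/b", "https://example.gov/c"]
report = compare_url_sets(drupal_urls, storefront_urls)
```

The third table cannot be derived this way, since it depends on actually requesting each URL that HHS holds.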

In one case for the dead URLs table, the response came back with a 200 (OK) HTTP code, but the payload was HTML saying the content wasn't there any more (in this case, a YouTube video).
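A check for that kind of "soft" failure might look something like the sketch below. It is an assumption about one way to do it, not the logic used for the attached report: the marker phrases, `is_dead`, and `check_url` are hypothetical and would need tuning against real responses.

```python
import urllib.error
import urllib.request

# Hypothetical phrases suggesting a "content is gone" page.
GONE_MARKERS = ("video unavailable", "page not found", "no longer available")


def is_dead(status: int, body: str) -> bool:
    """Return True when a response looks like a dead link."""
    if status >= 400:
        return True
    # A 200 response can still carry a page saying the content is gone.
    text = body.lower()
    return any(marker in text for marker in GONE_MARKERS)


def check_url(url: str, timeout: float = 10.0) -> bool:
    """Fetch url and report whether it looks dead (needs network access)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            body = response.read().decode("utf-8", errors="replace")
            return is_dead(response.status, body)
    except urllib.error.HTTPError as err:
        # urlopen raises HTTPError for 4xx and 5xx status codes.
        return is_dead(err.code, "")
    except OSError:
        # Connection failures and timeouts are treated as dead.
        return True
```

Matching on phrases will produce false positives for live pages that happen to contain one of the markers, so flagged rows would still deserve a manual look.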

This is a starting point. I'm still digging into how to generate the report in a way that doesn't involve me logging into the production Drupal server. Also, you may well want to narrow the scope of the report (though I assume it would be useful for somebody to know that a URL we gave HHS is now dead, even if that URL is for something other than one of our own pages, such as a YouTube video).

Let me know what changes, if any, would make this report more useful. There's a lot of duplication between the last two tables, but the second table will have URLs we don't have in Drupal because they link somewhere else (such as YouTube). You'd need to go to the third table to find out which of those links no longer work.

Comment entered 2024-12-05 14:56:57 by Juthe, Robin (NIH/NCI) [E]

Thanks! As discussed in the status meeting, please add a column on the far right to include the Description for everything on the Missing From HHS Storefront tab.

Comment entered 2024-12-05 16:38:36 by Englisch, Volker (NIH/NCI) [C]

After looking at a few samples, it appears that Bob's list of "Dead URLs On Storefront" is correct.

I can confirm that all of the documents listed on the Storefront's "Health Report" (which shows the unhealthy documents) are included in Bob's report. I also followed several of the documents that are not on the "Health Report" (Fatigue, Nutrition in Cancer Care, and Childhood Brain Stem Glioma Treatment, for instance) and confirmed that they result in the "Not Found" page.

It would be interesting to know why the Health Report isn't giving us a full picture, but since nobody would fix the report anyway, we should just go with what we have (and what we can confirm to be correct). It is also surprising to me that these missing documents are still served up by the Storefront. The Storefront provides the documents from its own data storage. It is possible that the "Health Report" and/or the process to update documents hasn't run in a very long time, in order to preserve the status quo.

What this exercise is showing us is that the problem of missing documents in the Storefront is much worse than I expected.

Comment entered 2024-12-05 17:48:35 by Kline, Bob (NIH/NCI) [C]

So we won't be keeping the descriptions in sync for the ones they already have?

Comment entered 2024-12-24 07:48:26 by Kline, Bob (NIH/NCI) [C]

As requested, here is the report with the page descriptions included.

Comment entered 2024-12-24 07:50:01 by Kline, Bob (NIH/NCI) [C]

Didn't get an answer, so I went ahead and added the description column to both of the first two sheets.

File Name Posted User
hhs-storefront-report-20241125.xlsx 2024-11-25 10:17:31 Kline, Bob (NIH/NCI) [C]
hhs-storefront-report-20241224.xlsx 2024-12-24 07:47:46 Kline, Bob (NIH/NCI) [C]
