CDR Tickets

Issue Number 5301
Summary pdqdocs filter out of date
Created 2023-11-17 15:17:08
Issue Type Bug
Submitted By Kline, Bob (NIH/NCI) [C]
Assigned To Englisch, Volker (NIH/NCI) [C]
Status QA Verified
Resolved 2023-11-28 19:36:07
Resolution Fixed
Path /home/bkline/backups/jira/ocecdr/issue.368410
Description

It looks like the XSL/T filter in /sftp/sftphome/cdrstaging/pdq-prod/docs on the sFTP server does not match what's in the source code control repository (in fact, it hasn't been refreshed since March of 2017). However, even the latest version of the filter in the riemann branch is no longer in sync with the current summary document structure, as some bits are missing in the filtered output. Note in this screenshot that the title and CDR ID are both missing when running show-PDQ-summary.py on CDR62824 (on DEV).

Comment entered 2023-11-17 16:54:00 by Kline, Bob (NIH/NCI) [C]
Comment entered 2023-11-27 17:26:47 by Englisch, Volker (NIH/NCI) [C]

I am not clear what the goal is for this ticket. You are referring to the XSLT filter sample on the sFTP server, but then you are using the "show-PDQ-summary.py" script on the CDR server, which isn't used by the partners. Additionally, the script is loading a document from the database when it should be using the sample document.

I did download the sample XML and XSLT documents from the sFTP server and the transformed HTML document didn't show those issues you're referring to.  Here is the screenshot:

Of course, the XSLT and XML documents are fairly dated but they still serve their purpose:  Showing our partners how to transform a summary document.

Do we want to refresh the sample documents for the partners or do we want to fix the script show-PDQ-summary.py?

Comment entered 2023-11-27 19:39:14 by Englisch, Volker (NIH/NCI) [C]

Your test didn't display the summary title because it used the published version of the document. The XSLT filter, however, expects to process a summary title that includes the string "(PDQ (R))". That string is only included in the partner XML, so splitting the summary title in order to insert the "<sup/>" tag for the registered trademark symbol removes the actual summary title.
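
A minimal Python sketch of the failure mode described above, for illustration only. It assumes the filter splits the title with XPath-style substring-before() and substring-after(), which return an empty string when the marker is absent; the actual logic in PDQ-summary.xsl may differ. The helper names and the sample title are hypothetical.

```python
MARKER = "(PDQ (R))"  # string the filter expects, per the comment above


def substring_before(text, marker):
    """Mimic XPath substring-before(): empty string if marker is absent."""
    head, sep, _ = text.partition(marker)
    return head if sep else ""


def substring_after(text, marker):
    """Mimic XPath substring-after(): empty string if marker is absent."""
    _, sep, tail = text.partition(marker)
    return tail if sep else ""


def naive_title(title):
    """Split unconditionally; the title vanishes when the marker is missing."""
    return (substring_before(title, MARKER)
            + "(PDQ<sup>&#174;</sup>)"
            + substring_after(title, MARKER))


def guarded_title(title):
    """Split only when the marker is present; otherwise keep the title as is."""
    if MARKER not in title:
        return title
    return naive_title(title)
```

With a published-style title such as "Sample Cancer Treatment" (no marker), naive_title() returns only the trademark markup, while guarded_title() returns the title unchanged. A check of this kind is one way a filter could handle both published and partner XML.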

Comment entered 2023-11-28 08:53:39 by Kline, Bob (NIH/NCI) [C]

"I am not clear what the goal is for this ticket."

The goal is to have what's in source code control match what we give the partners. I thought there was a second problem, but as you pointed out, that second problem was in my head. 😛

Comment entered 2023-11-28 19:35:49 by Englisch, Volker (NIH/NCI) [C]

I've made some changes to the filter to prevent the deletion of the SummaryTitle so that the filter works properly with published XML documents as well as partner XML output. At the same time I cleaned up the code a bit, removed sections that were commented out and removed code around unused/outdated parameters.  I also updated the location for our image files so that images are displayed again.

The following files have been updated in the pdqdocs directory:

  • PDQ-summary.xml

  • PDQ-summary.xsl

  • PDQ-summary.html

https://github.com/NCIOCPL/cdr-publishing/commit/eece76f

Note: Copying these files to the sFTP server is a manual task.

Comment entered 2023-11-28 21:33:32 by Kline, Bob (NIH/NCI) [C]

Comment entered 2024-12-18 11:50:36 by Englisch, Volker (NIH/NCI) [C]

There is no directory "D:\cdr\pdqdocs" on QA or STAGE.  This is where the PDQ partner docs live on DEV.

Clicking the link on QA results in an error:

Error reading file 'D:/cdr/pdqdocs/PDQ-summary.xsl': failed to load external entity "D:/cdr/pdqdocs/PDQ-summary.xsl"

Comment entered 2024-12-18 12:24:33 by Kline, Bob (NIH/NCI) [C]

I'm pretty sure that directory has never been copied by the cdr-deploy.py script (going all the way back to Alan's original code), since where it really lives is on the s/FTP server. If they're on the Windows server, it's very likely they were put there by (your) hand. If we want to change that, let's create a Sanger ticket.

Comment entered 2024-12-18 13:04:21 by Englisch, Volker (NIH/NCI) [C]

You are probably right.  In that case we should probably add code to the script to display a warning on QA and STAGE indicating that the script is only supposed to be used on DEV.  People who don't have the memory of an elephant (me, for instance) are easily confused.  When I'm supposed to test something on QA and it behaves differently than DEV I immediately enter panic mode. 🙂 
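
A rough sketch of the kind of check suggested here, assuming the script can be told which tier it is running on. The function name check_pdqdocs and the way the tier is passed in are hypothetical; the ticket does not show how show-PDQ-summary.py determines its tier. The default path is taken from the error message above.

```python
from pathlib import Path
from typing import Optional


def check_pdqdocs(tier: str, docs_dir: str = "D:/cdr/pdqdocs") -> Optional[str]:
    """Return a warning to display, or None if the script can proceed."""
    # The partner docs are only present on DEV, so warn on any other tier.
    if tier.upper() != "DEV":
        return (f"This script is only intended for use on DEV "
                f"(current tier: {tier}).")
    # Even on DEV, report a missing filter clearly instead of letting
    # the XSLT loader fail with a low-level error.
    xsl = Path(docs_dir, "PDQ-summary.xsl")
    if not xsl.is_file():
        return f"Filter not found: {xsl}"
    return None
```

The caller would display the returned message and stop before trying to load the filter, so that QA and STAGE show an explanation rather than the "failed to load external entity" error.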

Also, you may have seen the note below that the copy of these files is a manual task - not part of the deployment script.

I agree with you now. We should copy all files to the sFTP server, but we cannot replace the pdqdocs directory. The DTD file is part of the pdqdocs directory on the sFTP server, but it lives in a different directory on our systems. We may want to change that portion in the future.

For now, let's copy the files from the pdqdocs directory in GitHub to the pdqdocs directory on the sFTP server.
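
A small sketch of copying file by file rather than replacing the directory, so that files existing only in the target (such as the DTD) are left alone. It assumes both directories are reachable as local paths, for example a staging copy made before the manual upload; it does not perform any sFTP transfer. The function name refresh_pdqdocs is hypothetical.

```python
import shutil
import tempfile
from pathlib import Path


def refresh_pdqdocs(source_dir, target_dir):
    """Copy each file from source_dir into target_dir.

    Files that exist only in target_dir are not touched or removed.
    Returns the names of the files that were copied.
    """
    source, target = Path(source_dir), Path(target_dir)
    copied = []
    for path in sorted(source.iterdir()):
        if path.is_file():
            shutil.copy2(path, target / path.name)  # overwrites same-named file
            copied.append(path.name)
    return copied


if __name__ == "__main__":
    # Demonstration with throwaway directories and placeholder file names.
    with tempfile.TemporaryDirectory() as tmp:
        src, dst = Path(tmp, "src"), Path(tmp, "dst")
        src.mkdir()
        dst.mkdir()
        (src / "PDQ-summary.xsl").write_text("new filter")
        (dst / "PDQ-summary.xsl").write_text("old filter")
        (dst / "example.dtd").write_text("kept")
        print(refresh_pdqdocs(src, dst))
        print(sorted(p.name for p in dst.iterdir()))
```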

Attachments
File Name Posted User
image-2023-11-17-15-16-20-789.png 2023-11-17 15:16:21 Kline, Bob (NIH/NCI) [C]
image-2023-11-28-21-32-04-144.png 2023-11-28 21:32:04 Kline, Bob (NIH/NCI) [C]
Screenshot 2023-11-27 at 5.15.53 PM.png 2023-11-27 17:17:03 Englisch, Volker (NIH/NCI) [C]
