CDR Tickets

Issue Number 5111
Summary SVPC summaries validated against partner DTD
Created 2022-05-23 09:10:06
Issue Type Bug
Submitted By Kline, Bob (NIH/NCI) [C]
Assigned To Kline, Bob (NIH/NCI) [C]
Status Closed
Resolved 2022-05-23 12:08:13
Resolution Fixed
Path /home/bkline/backups/jira/ocecdr/issue.318783
Description

The new logic to drop the SVPC documents from the set pushed to the sFTP server was added to the sftp-export-data.py publishing script. What I didn't realize when I made that change was that the CG2Public.py script, which runs before sftp-export-data.py, validates all of the exported documents against the partner DTD. If there are any validation errors, CG2Public.py:

  1. logs the validation errors

  2. exits with an error code (instead of 0, the success code), which triggers an email notification

  3. skips its last processing steps, which were to copy the DTD and the media catalog to the "licensee" directory.
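
The control flow behind the three consequences listed above can be sketched roughly as follows. This is an illustration only, not the actual CG2Public.py code: the function name run_export_checks and the validate and final_steps callables are hypothetical stand-ins for the DTD validation and for the final copy of the DTD and media catalog.

```python
import logging
from typing import Callable, Iterable, List

logger = logging.getLogger("cg2public-sketch")


def run_export_checks(
    paths: Iterable[str],
    validate: Callable[[str], List[str]],
    final_steps: Callable[[], None],
) -> int:
    """Return a process exit code: 0 on success, 1 if any document fails.

    validate (hypothetical) returns a list of error messages for one file.
    final_steps (hypothetical) stands in for copying the DTD and the
    media catalog to the "licensee" directory.
    """
    failures = 0
    for path in paths:
        errors = validate(path)
        for error in errors:
            # Consequence 1: each validation error is logged.
            logger.error("%s: %s", path, error)
        if errors:
            failures += 1
    if failures:
        # Consequences 2 and 3: non-zero exit code, and the final copy
        # steps are never reached.
        return 1
    final_steps()
    return 0
```

With this shape, a single invalid document is enough to prevent the media catalog from being copied, which matches the behavior described here.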

As far as I can tell, failure to copy the DTD is not really a problem, because it does not appear that sftp-export-data.py includes the DTD in the files copied to the sFTP server. However, the media catalog (media_catalog.txt) is supposed to be copied to the sFTP server, both by itself and in the full.tar.gz file. I have pushed that file to the sFTP server, where I added it to the full directory and to the full.tar.gz file. We'll probably want to do this by hand until the fix for this bug is deployed, unless knows that this file is not used by the data partners (a pretty unlikely thing for him to know for sure).

The fix is to move the logic to skip the SVPC files from sftp-export-data.py into CG2Public.py.
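
A minimal sketch of that idea, assuming the SVPC documents can be recognized by some predicate before validation starts. The names split_for_validation and is_cms_only are hypothetical and are not taken from either script; how the real code identifies SVPC documents is not shown in this ticket.

```python
from typing import Callable, Iterable, List, Tuple


def split_for_validation(
    paths: Iterable[str],
    is_cms_only: Callable[[str], bool],
) -> Tuple[List[str], List[str]]:
    """Separate exported files into (to_validate, skipped).

    is_cms_only is a hypothetical predicate that returns True for
    documents (such as SVPC summaries) which should never be checked
    against the partner DTD.
    """
    to_validate: List[str] = []
    skipped: List[str] = []
    for path in paths:
        if is_cms_only(path):
            skipped.append(path)
        else:
            to_validate.append(path)
    return to_validate, skipped


# Example with made-up file names: only the first list would be
# handed to the DTD validation step.
svpc_files = {"CDR0000000002.xml"}
to_validate, skipped = split_for_validation(
    ["CDR0000000001.xml", "CDR0000000002.xml"],
    lambda path: path in svpc_files,
)
```

Filtering before validation, rather than after it, means the skipped documents can no longer produce validation errors that block the final copy steps.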

Comment entered 2022-05-23 10:16:01 by Kline, Bob (NIH/NCI) [C]
cd /sftp/sftphome/cdrstaging/pdq-prod
gunzip full.tar.gz
cd full
tar --numeric-owner --owner=0 --group=0 -rf ../full.tar media_catalog.txt
cd ..
gzip full.tar
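
For reference, roughly the same manual repair can be sketched in Python with the standard library. This is an approximation of the commands above, not a script from the CDR repository: append_to_tar_gz is a hypothetical helper, and clearing the owner names is only an assumption about how closely it needs to mimic --numeric-owner.

```python
import gzip
import shutil
import tarfile
from pathlib import Path


def append_to_tar_gz(archive: Path, member: Path, arcname: str) -> None:
    """Append one file to an existing .tar.gz archive.

    A gzip-compressed tar cannot be appended to in place, so the archive
    is decompressed, extended, and compressed again.
    """
    plain = archive.with_suffix("")  # e.g. full.tar.gz -> full.tar

    # gunzip full.tar.gz
    with gzip.open(archive, "rb") as src, open(plain, "wb") as dst:
        shutil.copyfileobj(src, dst)

    def as_root(info: tarfile.TarInfo) -> tarfile.TarInfo:
        # Approximates --numeric-owner --owner=0 --group=0.
        info.uid = info.gid = 0
        info.uname = info.gname = ""
        return info

    # tar -rf full.tar <member>
    with tarfile.open(plain, "a") as tar:
        tar.add(member, arcname=arcname, filter=as_root)

    # gzip full.tar
    with open(plain, "rb") as src, gzip.open(archive, "wb") as dst:
        shutil.copyfileobj(src, dst)
    plain.unlink()
```

The member is stored under the name given in arcname, which corresponds to running tar from inside the full directory in the commands above.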
Comment entered 2022-05-23 12:08:13 by Kline, Bob (NIH/NCI) [C]

I have committed the fix.

60383eb Don't validate CMS-only docs against partner DTD (OCECDR-5111)

I would like to test this (as well as the fix for OCECDR-5108) but first we'll need to apply Oersted to CDR DEV, and I'd rather not do that without first consulting with him when he's back from vacation.

Comment entered 2022-05-23 12:09:29 by Kline, Bob (NIH/NCI) [C]

One other possible approach is—if we decide Ohm is close enough to ready for this step—we deploy Ohm to QA and Volker tests there.

Comment entered 2022-06-08 15:01:28 by Englisch, Volker (NIH/NCI) [C]

I successfully ran the publishing job on DEV and can confirm that the SVPC documents were skipped and no longer caused a validation error. The publishing job still had some validation errors, but these are related to incorrectly formatted test data. I fixed those errors and I'm running another publishing job, but for the purpose of this ticket the publishing job ran OK.

Comment entered 2022-06-08 17:22:08 by Kline, Bob (NIH/NCI) [C]

OK, thanks. I'm moving this one into "Task Reviewed (DEV Verified)" then.

Comment entered 2022-06-08 18:21:27 by Englisch, Volker (NIH/NCI) [C]

Just to follow up, the latest publishing job on DEV finished without any validation warnings.  All is better again!

Comment entered 2022-10-10 13:49:02 by Englisch, Volker (NIH/NCI) [C]

The SVPC documents are correctly excluded from validation on PROD when running the CG2Public.py script.

Closing ticket.
