CDR Tickets

Issue Number 4288
Summary Push Verification Service Changes
Created 2017-07-20 14:06:00
Issue Type Improvement
Submitted By Englisch, Volker (NIH/NCI) [C]
Assigned To Kline, Bob (NIH/NCI) [C]
Status Closed
Resolved 2018-03-13 09:25:51
Resolution Fixed
Path /home/bkline/backups/jira/ocecdr/issue.211795
Description

Due to the Feynman streamlining changes, the push verification job may now check against a Gatekeeper server (on the lower tiers) that didn't receive the XML data files. A temporary fix is currently in place, and this ticket should set up a permanent fix.

Comment entered 2017-07-20 14:08:06 by Englisch, Volker (NIH/NCI) [C]

From an email from Bob explaining the problem, the suggested permanent fix, and the reason why he has been tagged to implement the fix:

From: Bob Kline bkline@rksystems.com
Sent: Monday, June 26, 2017 2:33 PM
To: Englisch, Volker (NIH/NCI) [C] <volker@mail.nih.gov>
Subject: Re: Push verification service failing on DEV

I figured it out. Our technique for modifying cdr2gk.host worked fine in a world where a script runs, does its thing, and shuts down. Not so much for a world in which the cdr2gk module is loaded by a service which runs for arbitrarily long stretches (possibly weeks or months), handling lots of jobs. If a job doesn't override the default GK host, we just use what the module has. Unfortunately, if the previous job changed what the module had, the host name won't be the default any more. I have modified the check to preserve and restore the original value. When I restarted the scheduler with the modified code, the verify job ran successfully the first time.

My modification will work as a temporary fix, but it's not good enough: it's possible for another job running under the scheduler to kick off while the verify job is running. The silver lining is this is a problem only on the lower tiers. But it's still something we want to fix properly, which means leaving the default cdr2gk.host value at the default pulled from the cdrapphosts file, and modifying the cdr2gk code to pass the host value on the stack everywhere, treating the global host variable as a read-only constant once it's set, and then modifying all the code that uses the cdr2gk module to pass in a host parameter when it needs to override the default.

Whichever one gets around to putting in the JIRA ticket first can assign the work to the other guy. :-)

Bob
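
The two approaches described in the email above can be sketched roughly as follows. This is an illustration only, not the actual cdr2gk code: the function names, the `state` dictionary standing in for the module-level host variable, and the example.org host names are all hypothetical. The first function shows the save-and-restore idea behind the temporary fix, which still touches shared state while it runs. The second shows the proposed pattern of passing the host as an argument, so the default is never modified and parallel jobs cannot affect each other.

```python
import threading

# Hypothetical default host. In the real system the default is pulled from
# the cdrapphosts file; this literal is only a placeholder.
DEFAULT_HOST = "gk-default.example.org"


def send_legacy(doc_id, state):
    """Old pattern: the target host is read from shared, mutable state."""
    return f"{doc_id} -> {state['host']}"


def verify_with_restore(doc_id, state, override=None):
    """Temporary-fix pattern: override the shared host, then restore it.

    Another job running between the override and the restore would still
    see the overridden value, which is why this is not a complete fix.
    """
    original = state["host"]
    try:
        if override:
            state["host"] = override
        return send_legacy(doc_id, state)
    finally:
        state["host"] = original


def send(doc_id, host=None):
    """Permanent-fix pattern: the host is passed in; the default is read-only."""
    return f"{doc_id} -> {host or DEFAULT_HOST}"


def run_jobs(jobs):
    """Run (doc_id, host) jobs in parallel threads and collect the results."""
    results = {}

    def worker(doc_id, host):
        results[doc_id] = send(doc_id, host)

    threads = [threading.Thread(target=worker, args=job) for job in jobs]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


if __name__ == "__main__":
    shared = {"host": DEFAULT_HOST}
    print(verify_with_restore("CDR1", shared, "gk-qa.example.org"))
    print(shared["host"])  # restored to the default after the call
    print(run_jobs([("CDR2", "gk-qa.example.org"), ("CDR3", None)]))
```

With the second pattern, a job that needs a non-default Gatekeeper host supplies it on each call, and a job that passes nothing gets the default regardless of what other jobs are doing.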

Comment entered 2018-03-13 09:25:51 by Kline, Bob (NIH/NCI) [C]

This was fixed by Gauss.

Comment entered 2018-04-10 17:19:27 by Englisch, Volker (NIH/NCI) [C]

I was unable to trick the system. I ran two publishing jobs in parallel, one pushing documents to QA and one pushing to DT. Both jobs finished successfully and verified against the appropriate GK server.

Comment entered 2018-05-10 19:13:11 by Englisch, Volker (NIH/NCI) [C]

This particular problem only affected us on the lower tiers where we switch and point to different servers during testing. We never switch servers on PROD.
The testing we did on the QA server was sufficient. Closing ticket.
