CDR Tickets

Issue Number 4948
Summary [Internal] Make Emergency Timeout Configurable
Created 2021-02-24 12:55:39
Issue Type Improvement
Submitted By Englisch, Volker (NIH/NCI) [C]
Assigned To Kline, Bob (NIH/NCI) [C]
Status Closed
Resolved 2021-05-13 14:51:36
Resolution Fixed
Path /home/bkline/backups/jira/ocecdr/issue.285528
Description

On occasion our publishing job loses database connectivity and stops processing documents, but it continues to run. For these situations we implemented a "runaway job cancellation". The amount of time we wait before declaring the job failed is set to 3 hours, which works well for the weekend job. For the nightly job, which finishes in about 15-25 minutes, I'd like to be able to interrupt the job sooner.

Bob suggested making this timeout configurable.
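
For illustration only, the runaway check described above might look roughly like the following sketch. The function name, the parameters, and the choice to measure the wait from the job's last recorded activity are assumptions for this example, not the actual CDR publishing code.

```python
from datetime import datetime, timedelta

# The 3-hour wait mentioned in the description, expressed in seconds.
DEFAULT_WAIT_SECONDS = 3 * 60 * 60


def is_runaway(last_activity, now, wait_seconds=DEFAULT_WAIT_SECONDS):
    """Return True if the job has shown no activity for longer than wait_seconds.

    Hypothetical helper: `last_activity` and `now` are datetime objects, and
    the wait is assumed to be measured from the last recorded activity.
    """
    return (now - last_activity) > timedelta(seconds=wait_seconds)


if __name__ == "__main__":
    last = datetime(2021, 2, 24, 1, 0, 0)
    # With a 2-minute timeout, three minutes of silence counts as a runaway job.
    print(is_runaway(last, datetime(2021, 2, 24, 1, 3, 0), wait_seconds=120))
```

With a fixed 3-hour wait, a nightly job that normally finishes in under half an hour would sit idle for hours before being cancelled, which is why a shorter per-job value is useful.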

Comment entered 2021-05-07 19:59:20 by Kline, Bob (NIH/NCI) [C]

Ready to be tried out on DEV. You can set a Publishing group value whose name is the publishing subset name followed by "-wait_seconds". For example (see the attached image-2021-05-07-19-59-07-713.png):

Comment entered 2021-05-10 12:42:09 by Englisch, Volker (NIH/NCI) [C]

I'm wondering why the Name value is "Interim-Export-wait_seconds" and not "Interim-Export-wait-seconds"? (dash vs. underscore)

Comment entered 2021-05-10 13:24:17 by Englisch, Volker (NIH/NCI) [C]

I've adjusted the setting for the publishing job to fail after 2 minutes and it successfully cancelled the job.

Works as expected!

Comment entered 2021-05-13 13:36:04 by Kline, Bob (NIH/NCI) [C]

I will change the name so that it uses hyphens instead of underscores.

Comment entered 2021-05-13 14:51:36 by Kline, Bob (NIH/NCI) [C]

I swapped out the underscore for a hyphen in the control value name.
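
As a rough sketch of the naming convention after this change, a per-subset timeout lookup could work as shown below. The function name, the plain dictionary standing in for the Publishing control values, and the fallback to the 3-hour default when the value is missing or invalid are all assumptions for illustration, not the actual implementation.

```python
# Assumed default: the 3-hour wait from the ticket description, in seconds.
DEFAULT_WAIT_SECONDS = 3 * 60 * 60


def wait_seconds_for(subset, control_values, default=DEFAULT_WAIT_SECONDS):
    """Look up the timeout for a publishing subset.

    Hypothetical helper: `control_values` is a plain dict standing in for
    the Publishing control group. The value name is the subset name
    followed by "-wait-seconds" (hyphens, per the rename in this ticket).
    """
    name = f"{subset}-wait-seconds"
    raw = control_values.get(name)
    if raw is None:
        # The option is optional, so fall back when it was never created.
        return default
    try:
        seconds = int(raw)
    except (TypeError, ValueError):
        return default
    # Treat zero or negative values as invalid (an assumption of this sketch).
    return seconds if seconds > 0 else default


if __name__ == "__main__":
    values = {"Interim-Export-wait-seconds": "120"}
    print(wait_seconds_for("Interim-Export", values))
    print(wait_seconds_for("Some-Other-Subset", values))
```

Because the lookup falls back to the default, subsets without an explicit value keep the existing behavior.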

Comment entered 2021-05-27 12:49:33 by Englisch, Volker (NIH/NCI) [C]

I can't find the new option Interim-Export-wait-seconds on QA to test this change.

Comment entered 2021-05-27 13:04:14 by Kline, Bob (NIH/NCI) [C]

It only exists if you create it. That's why it's called an option. 😛

Comment entered 2021-05-27 15:13:51 by Englisch, Volker (NIH/NCI) [C]

The publishing job fails based on the specified timeout value.

Comment entered 2021-06-16 15:44:59 by Englisch, Volker (NIH/NCI) [C]

I've modified the nightly publishing export job to cancel execution after 60 minutes if it has not finished. I will close this ticket tomorrow after the job has finished successfully, but I won't wait until we experience the first "runaway job", the situation for which this ticket was created.

Comment entered 2021-06-17 12:33:46 by Englisch, Volker (NIH/NCI) [C]

The publishing job finished successfully last night.  Closing ticket.

Attachments
File Name Posted User
image-2021-05-07-19-59-07-713.png 2021-05-07 19:59:08 Kline, Bob (NIH/NCI) [C]
