Issue Number | 5314 |
---|---|
Summary | Create health check monitor for CDR production server |
Created | 2023-12-26 17:49:33 |
Issue Type | Improvement |
Submitted By | Kline, Bob (NIH/NCI) [C] |
Assigned To | Kline, Bob (NIH/NCI) [C] |
Status | QA Verified |
Resolved | 2023-12-26 17:54:28 |
Resolution | Fixed |
Path | /home/bkline/backups/jira/ocecdr/issue.373149 |
CBIIT monitoring is not sufficient to prevent long-running outages of the production CDR, as evidenced by yesterday's disastrous failure. Create a new scheduled job to detect problems on that system and report them via email.
Implemented and installed on the DEV server. The job checks the health of CDR PROD once a minute and, in the event of failure, sends an email alert to the registered recipients (myself for now). If the failure persists, follow-up email messages are sent once an hour.
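The alert timing described above (immediate alert on the first failure, then one follow-up per hour while the failure persists) can be sketched roughly as follows. This is an illustration only, not the installed job: the `AlertThrottle` class and its `should_alert` method are hypothetical names, and the sketch assumes the state lives in a long-running process. A job launched fresh every minute would need to persist `last_alert` somewhere between runs.

```python
from datetime import datetime, timedelta
from typing import Optional


class AlertThrottle:
    """Hypothetical helper: decide whether a check result warrants an email."""

    def __init__(self, repeat_after: timedelta = timedelta(hours=1)):
        self.repeat_after = repeat_after
        # Time of the most recent alert; None while the server is healthy.
        self.last_alert: Optional[datetime] = None

    def should_alert(self, healthy: bool, now: datetime) -> bool:
        if healthy:
            # Recovery clears the state so the next failure alerts at once.
            self.last_alert = None
            return False
        if self.last_alert is None or now - self.last_alert >= self.repeat_after:
            # First failure, or the failure has lasted another full interval.
            self.last_alert = now
            return True
        # Still failing, but an alert went out less than an hour ago.
        return False
```

A scheduler would call `should_alert()` with the result of each once-a-minute check and send mail only when it returns `True`.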