CDR Tickets

Issue Number 4441
Summary [System] Move Hoover Config File to DB
Created 2018-03-15 12:50:27
Issue Type Improvement
Submitted By Englisch, Volker (NIH/NCI) [C]
Assigned To Kline, Bob (NIH/NCI) [C]
Status Closed
Resolved 2018-03-16 23:06:25
Resolution Fixed
Path /home/bkline/backups/jira/ocecdr/issue.222694
Description

The Filesweeper config file (a.k.a. Hoover) should be imported into the CDR database so that we'll be able to make release-independent changes to this file, similar to our publishing control files.
This will require the creation of a new document type and schema.

Comment entered 2018-03-16 14:16:36 by Kline, Bob (NIH/NCI) [C]

Created new schema.

https://github.com/NCIOCPL/cdr-server/commit/34263953

Eliminated misleading and/or redundant comments and moved comment to end of block.

https://github.com/NCIOCPL/cdr-scheduler/commit/d47f5e3

Modified scheduled task to pull configuration from the CDR.

https://github.com/NCIOCPL/cdr-scheduler/commit/095add4

Comment entered 2018-03-16 16:45:03 by Kline, Bob (NIH/NCI) [C]

As I'm preparing to create the new document type, I'm smoothing out some rough edges I'm finding along the way. One of those rough edges is the fact (which I noticed a good while back, but never got around to fixing) that the document type editing interface (as well as the API) ignored the title_filter column of the doc_type table. So you couldn't really create a new document type, or change the title filter for an existing document type, without some direct SQL. I'm fixing that, and I'm tempted to think it would be OK for the picklist of filters which could be selected for the title filter column to only have filters whose titles start with "DocTitle for ...." This is true for all of the filters plugged into the title_filter column of the doc_type table, and it would certainly make the picklist less unwieldy. Any objections?
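The picklist restriction proposed above can be sketched as follows. This is only an illustration and is not taken from the CDR code: the function name and the sample titles are hypothetical, and the only detail drawn from the comment is the "DocTitle for" title prefix.

```python
from typing import Iterable, List


def title_filter_choices(filter_titles: Iterable[str],
                         prefix: str = "DocTitle for ") -> List[str]:
    """Return only the filter titles eligible for the title-filter picklist.

    Hypothetical helper: keeps titles that begin with the given prefix
    and sorts them so the picklist is stable and easy to scan.
    """
    return sorted(t for t in filter_titles if t.startswith(prefix))


# Sample titles are made up for illustration.
titles = ["DocTitle for Widget", "Some Other Filter", "DocTitle for Gadget"]
choices = title_filter_choices(titles)
```

Filtering on the prefix keeps unrelated filters out of the picklist without requiring any new column or flag on the filter documents.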

Comment entered 2018-03-16 17:28:58 by Englisch, Volker (NIH/NCI) [C]

No objections.

Comment entered 2018-03-16 23:00:33 by Kline, Bob (NIH/NCI) [C]

Fixed the SQL queries to find config document version.

https://github.com/NCIOCPL/cdr-scheduler/commit/901b4cc

Created tool for storing file sweeper configuration document.

https://github.com/NCIOCPL/cdr-tools/commit/613817c

Fixed document type editor software to include title filter.

https://github.com/NCIOCPL/cdr-admin/commit/edc9a8d4
https://github.com/NCIOCPL/cdr-lib/commit/ec4044b

Comment entered 2018-03-16 23:06:25 by Kline, Bob (NIH/NCI) [C]

This is ready for you to test on DEV. The config file is still maintained at tasks/FileSweeper.cfg in the cdr-scheduler repository (and for now, the software looks in d:/cdr/Scheduler/tasks/ for that file as a fallback if it can't find the document in the CDR). The tool for storing changes to the config document is at https://github.com/NCIOCPL/cdr-tools/blob/hawking/DevTools/Utilities/StoreSweepSpecs.py (run it with --help to see the invocation syntax).

Comment entered 2018-03-22 14:50:19 by Englisch, Volker (NIH/NCI) [C]

The first link to the new schema isn't working.

Comment entered 2018-03-22 15:09:57 by Kline, Bob (NIH/NCI) [C]

Hadn't been pushed yet. Works now.

Comment entered 2018-03-22 15:19:19 by Englisch, Volker (NIH/NCI) [C]

I'm not totally clear on how this is supposed to work.

When the FileSweeper runs, it will load the config file from the DB unless that doesn't exist, in which case the config file is read from its regular location, right?
The scheduler specifies the path to the FileSweeper.cfg file as a parameter. Wouldn't that mean we're not using the DB version at all?

Comment entered 2018-03-22 15:24:39 by Kline, Bob (NIH/NCI) [C]

Belt and suspenders. If all goes well, Hoover will get the configuration from the CDR repository. If something goes wrong, it will fall back on the old method of finding out what to do. I don't have it set up so that you can tell the software which method to use. The path parameter is just there to support the fallback. Eventually, after the new method has been working correctly in production, we can remove the parameter and fallback code completely. Make sense?
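The "belt and suspenders" behavior described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the scheduler's actual code: `load_sweep_config` and the `fetch_from_cdr` callable are hypothetical names, and the real task may distinguish failure cases differently.

```python
from pathlib import Path
from typing import Callable, Optional, Tuple


def load_sweep_config(fetch_from_cdr: Callable[[], Optional[str]],
                      fallback_path: Path) -> Tuple[str, str]:
    """Return (config_text, source), preferring the repository copy.

    fetch_from_cdr is a hypothetical callable that returns the stored
    configuration document, or None if no such document exists.
    """
    try:
        text = fetch_from_cdr()
        if text:
            return text, "cdr"
    except Exception:
        # Any failure reaching the repository triggers the fallback.
        pass
    # Old method: read the file whose path the scheduler passes in.
    return fallback_path.read_text(encoding="utf-8"), "file"
```

The caller does not choose the source; the file path is consulted only when the repository lookup yields nothing, which matches the point that the path parameter exists solely to support the fallback.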

Comment entered 2018-03-22 15:30:45 by Englisch, Volker (NIH/NCI) [C]

I see. This means the DB content will be used if it is available. I thought the DB content would possibly overwrite anything read from the CDR.
This also means that if I forget to store the updates using your new tool StoreSweepSpecs.py, Hoover will run with the old configuration.

Mental note: Don't forget to store the updated file in the DB.

Comment entered 2018-03-22 15:38:21 by Englisch, Volker (NIH/NCI) [C]

I'm guessing we're not using the following files anymore:

  • LogLoop.off

  • LogLoop.on

  • VerifyPushJobsLoop.on

If this is correct I will remove those files.

Comment entered 2018-03-22 16:25:31 by Kline, Bob (NIH/NCI) [C]

Correct. Now that we have the CDR Scheduler interface we no longer need that mechanism.

Comment entered 2018-03-23 12:03:06 by Englisch, Volker (NIH/NCI) [C]

What's your plan for the new session log files? Keep them, delete them, archive them after what time, on which tier?
Same question for the dblogger and https_api-logger_ files.

Comment entered 2018-03-23 12:49:17 by Kline, Bob (NIH/NCI) [C]

We can archive the session files after, say, three months (all tiers). Not too worried about the other two as they don't take up much space. The session log is mirrored in the DB, so we may eventually need to have CBIIT apply the same truncating scheduled jobs as we used for the debug_log and the command_log tables.
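As a rough sketch of the retention rule suggested above, the following selects files older than a cutoff by modification time. The function name, the 90-day approximation of "three months", and the example glob pattern are assumptions for illustration; the actual archiving is driven by the Hoover configuration, not by this code.

```python
import time
from pathlib import Path
from typing import List, Optional


def files_to_archive(directory: Path, pattern: str,
                     max_age_days: int = 90,
                     now: Optional[float] = None) -> List[Path]:
    """List files matching pattern whose mtime is older than max_age_days.

    Hypothetical helper: 90 days stands in for "three months".
    """
    current = time.time() if now is None else now
    cutoff = current - max_age_days * 86400  # seconds per day
    return sorted(p for p in directory.glob(pattern)
                  if p.is_file() and p.stat().st_mtime < cutoff)


# Example call with a made-up directory and pattern:
# old_logs = files_to_archive(Path("logs"), "session*.log")
```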

Comment entered 2018-03-23 16:56:54 by Englisch, Volker (NIH/NCI) [C]

The tool to store the config file in the database has been tested successfully. The only note I have is that Alan must have ignored comments, but the schema only accepts comments at the end of a SweepSpec, so there were several warnings when I tried to store the config file. After moving the comments to where the schema expects them, the file was stored in the database without warnings or errors.

I updated the config file FileSweeper.cfg and successfully ran a Hoover run on DEV.

Comment entered 2018-04-10 18:55:03 by Englisch, Volker (NIH/NCI) [C]

I've updated and installed the FileSweeper.cfg in order to remove the file(s) FtpExportData.txt. This worked OK on DEV. However, on QA I'm getting a permission denied error and I'm not seeing right away what permission I will need to replace the config file for Hoover on QA.

Do you know?

Comment entered 2018-04-10 19:55:44 by Kline, Bob (NIH/NCI) [C]

Standard stuff. You need to be in a group that can add and modify documents of this type. You're in such a group now. Give it another shot.

Comment entered 2018-04-11 10:30:45 by Englisch, Volker (NIH/NCI) [C]

> a group that can add and modify documents of this type

I've looked at the groups but nothing stood out to me that might be related to the config file.
What type is this type?

Comment entered 2018-04-11 10:39:44 by Kline, Bob (NIH/NCI) [C]

Comment entered 2018-04-11 10:49:25 by Englisch, Volker (NIH/NCI) [C]

I still don't have the answer I was looking for but I can talk to you about it tomorrow. I was trying to figure out what group I need to belong to in order to have permissions to update this document (type).

Comment entered 2018-04-11 10:56:01 by Kline, Bob (NIH/NCI) [C]

Anyone in the Developers group has all the necessary permissions for that document type on QA. You're in the Developers group.

Creating the document type and adding the permissions will be a manual step for the deployment to STAGE and PROD.

Comment entered 2018-04-11 12:02:58 by Englisch, Volker (NIH/NCI) [C]

If I understand you correctly the reason why I wasn't able to "update" the config file was that the document type itself didn't exist yet. I thought that I didn't have permissions to update this particular document type. This makes sense to me now.

I did update the FileSweeper.cfg file and the nightly Hoover job did remove the files I added (FtpExportData.txt) and I am now able to update the config file.

Comment entered 2018-04-11 12:11:23 by Kline, Bob (NIH/NCI) [C]

> If I understand you correctly the reason why I wasn't able to "update" the config file was that the document type itself didn't exist yet.

Correct.

> I thought that I didn't have permissions to update this particular document type. ...

Well, you didn't. But that's only because you can't assign permissions to modify documents whose type hasn't yet been defined. :-)

> I did update the FileSweeper.cfg file and the nightly Hoover job did remove the files I added (FtpExportData.txt) and I am now able to update the config file.

Excellent!

Comment entered 2018-05-10 19:08:38 by Englisch, Volker (NIH/NCI) [C]

I confirmed that the FileSweeper.cfg config file is now part of the CDR as a document of type SweepSpecifications.
Closing ticket.

Attachments
File Name Posted User
Screen Shot 2018-04-11 at 10.36.14 AM.png 2018-04-11 10:38:01 Kline, Bob (NIH/NCI) [C]
