Issue Number | 4441 |
---|---|
Summary | [System] Move Hoover Config File to DB |
Created | 2018-03-15 12:50:27 |
Issue Type | Improvement |
Submitted By | Englisch, Volker (NIH/NCI) [C] |
Assigned To | Kline, Bob (NIH/NCI) [C] |
Status | Closed |
Resolved | 2018-03-16 23:06:25 |
Resolution | Fixed |
Path | /home/bkline/backups/jira/ocecdr/issue.222694 |
The FileSweeper config file (a.k.a. Hoover) should be
imported into the CDR database so that we'll be able to make
release-independent changes to this file, similar to our publishing
control files.
This will require the creation of a new document type and schema.
Created new schema.
https://github.com/NCIOCPL/cdr-server/commit/34263953
Eliminated misleading and/or redundant comments and moved comment to end of block.
https://github.com/NCIOCPL/cdr-scheduler/commit/d47f5e3
Modified scheduled task to pull configuration from the CDR.
~volker: As I'm preparing
to create the new document type, I'm smoothing out some rough edges I'm
finding along the way. One of those rough edges is the fact (which I
noticed a good while back, but never got around to fixing) that the
document type editing interface (as well as the API) ignored the
title_filter column of the doc_type table. So
you couldn't really create a new document type, or change the title
filter for an existing document type, without some direct SQL. I'm
fixing that, and I'm tempted to think it would be OK for the picklist of
filters which could be selected for the title filter column to only have
filters whose titles start with "DocTitle for ...." This is true for all
of the filters plugged into the title_filter column of the doc_type
table, and it would certainly make the picklist
less unwieldy. Any objections?
No objections.
Fixed the SQL queries to find config document version.
https://github.com/NCIOCPL/cdr-scheduler/commit/901b4cc
Created tool for storing file sweeper configuration document.
https://github.com/NCIOCPL/cdr-tools/commit/613817c
Fixed document type editor software to include title filter.
https://github.com/NCIOCPL/cdr-admin/commit/edc9a8d4
https://github.com/NCIOCPL/cdr-lib/commit/ec4044b
~volker: This is ready for you to test on DEV. The config file is still maintained at tasks/FileSweeper.cfg in the cdr-scheduler repository (and for now, the software looks in d:/cdr/Scheduler/tasks/ for that file as a fallback if it can't find the document in the CDR). The tool for storing changes to the config document is at https://github.com/NCIOCPL/cdr-tools/blob/hawking/DevTools/Utilities/StoreSweepSpecs.py (run it with --help to see the invocation syntax).
The first link to the new schema isn't working.
Hadn't been pushed yet. Works now.
I'm not totally clear on how this is supposed to work.
When the FileSweeper runs, it will load the config file from
the DB unless that doesn't exist, in which case the config file is read
from its regular location, right?
The scheduler specifies the path to the FileSweeper.cfg file as
a parameter. Wouldn't that mean we're not using the DB version at
all?
Belt and suspenders. If all goes well, Hoover will get the configuration from the CDR repository. If something goes wrong, it will fall back on the old method of finding out what to do. I don't have it set up so that you can tell the software which method to use. The path parameter is just there to support the fallback. Eventually, after the new method has been working correctly in production, we can remove the parameter and fallback code completely. Make sense?
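The "belt and suspenders" lookup described above can be sketched roughly as follows. This is only an illustration, not the code in cdr-scheduler: the function name `load_sweep_config` and the `fetch_from_cdr` callable are hypothetical stand-ins for whatever the scheduled task actually uses to read the document from the CDR.

```python
from pathlib import Path
from typing import Callable, Optional, Tuple


def load_sweep_config(fetch_from_cdr: Callable[[], Optional[str]],
                      fallback_path: str) -> Tuple[str, str]:
    """Return (source, config_text), preferring the repository copy.

    fetch_from_cdr is a hypothetical callable which returns the
    configuration XML stored in the CDR, or None if no such document
    exists. fallback_path is the path parameter supplied by the scheduler.
    """
    try:
        text = fetch_from_cdr()
        if text:
            return "cdr", text
    except Exception:
        # Any failure on the repository side triggers the fallback.
        pass
    # Old method: read the file deployed alongside the scheduled task.
    return "file", Path(fallback_path).read_text(encoding="utf-8")
```

With this shape the caller never chooses the method: the path is consulted only when the repository lookup returns nothing or fails, which is why the parameter can be dropped once the fallback code is removed.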
I see. This means the DB content will be used if it is available. I
thought the DB content would possibly overwrite anything read from the
CDR.
This also means, if I forget to store the updates using your new tool
StoreSweepSpecs.py, Hoover will run with the old
configuration.
Mental note: Don't forget to store the updated file in the DB.
I'm guessing we're not using the following files anymore:
LogLoop.off
LogLoop.on
VerifyPushJobsLoop.on
If this is correct I will remove those files.
Correct. Now that we have the CDR Scheduler interface we no longer need that mechanism.
~bkline, what's your plan
for the new session log files? Keep them, delete them, archive
them after what time, on which tier?
Same question for dblogger and https_api-logger_ files?
We can archive the session files after, say, three months (all tiers). Not too worried about the other two as they don't take up much space. The session log is mirrored in the DB, so we may eventually need to have CBIIT apply the same truncating scheduled jobs as we used for the debug_log and the command_log tables.
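As a rough sketch of the three-month rule for the session logs, the selection step might look like the following. The helper name `files_to_archive`, the 90-day approximation of "three months", and the file name pattern in the usage note are assumptions made for illustration. The real selection would be expressed in the sweeper configuration rather than in standalone code like this.

```python
import time
from pathlib import Path
from typing import List, Optional

# "Three months" approximated as 90 days (an assumption for this sketch).
NINETY_DAYS = 90 * 24 * 60 * 60


def files_to_archive(directory: str, pattern: str,
                     now: Optional[float] = None) -> List[Path]:
    """Return files matching pattern last modified more than 90 days ago."""
    now = time.time() if now is None else now
    cutoff = now - NINETY_DAYS
    return sorted(
        path for path in Path(directory).glob(pattern)
        if path.is_file() and path.stat().st_mtime < cutoff
    )
```

A call such as `files_to_archive("d:/cdr/Log", "session-*.log")` (both the directory and the pattern are hypothetical here) would list the candidates without touching them, which makes it easy to review the selection before anything is archived.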
The tool to store the config file in the database has been tested successfully. The only note I have is that Alan must have ignored comments, but the schema only accepts comments at the end of a SweepSpec, so there were several warnings when I tried to store the config file. After moving the comments to where the schema expects them, the file was stored in the database without warnings or errors.
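For anyone who has to make the same adjustment again, the comment move could be automated along these lines. This is a sketch under stated assumptions: it assumes the comments are child elements named `Comment` inside each `SweepSpec` element. Only `SweepSpec` is named in this ticket, and the other element names below are hypothetical. If the comments are XML comments (`<!-- ... -->`) instead, this parser drops them by default and a different approach would be needed.

```python
import xml.etree.ElementTree as ET


def move_comments_to_end(xml_text: str) -> str:
    """Move Comment children to the end of each SweepSpec block."""
    root = ET.fromstring(xml_text)
    # Materialize the list first so the tree is not reordered mid-iteration.
    for spec in list(root.iter("SweepSpec")):
        comments = [child for child in spec if child.tag == "Comment"]
        for comment in comments:
            spec.remove(comment)
        # Re-append in their original relative order.
        spec.extend(comments)
    return ET.tostring(root, encoding="unicode")
```

The relative order of the other children is preserved. Whitespace between elements may shift, so the result is best reviewed before it is stored with StoreSweepSpecs.py.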
I updated the config file FileSweeper.cfg and successfully ran a Hoover run on DEV.
I've updated and installed the FileSweeper.cfg in order to remove the file(s) FtpExportData.txt. This worked OK on DEV. However, on QA I'm getting a permission denied error and I'm not seeing right away what permission I will need to replace the config file for Hoover on QA.
Do you know, ~bkline?
Standard stuff. You need to be in a group that can add and modify documents of this type. You're in such a group now. Give it another shot.
> a group that can add and modify documents of this type
I've looked at the groups but nothing stood out to me that might be
related to the config file.
What type is this type?
I still don't have the answer I was looking for but I can talk to you about it tomorrow. I was trying to figure out what group I need to belong to in order to have permissions to update this document (type).
Anyone in the Developers group has all the necessary
permissions for that document type on QA. You're in the
Developers group.
Creating the document type and adding the permissions will be a manual step for the deployment to STAGE and PROD.
If I understand you correctly the reason why I wasn't able to "update" the config file was that the document type itself didn't exist yet. I thought that I didn't have permissions to update this particular document type. This makes sense to me now.
I did update the FileSweeper.cfg file and the nightly Hoover job did remove the files I added (FtpExportData.txt) and I am now able to update the config file.
> If I understand you correctly the reason why I wasn't able to "update" the config file was that the document type itself didn't exist yet.
Correct.
> I thought that I didn't have permissions to update this particular document type. ...
Well, you didn't. But that's only because you can't assign permissions to modify documents whose type hasn't yet been defined. :-)
> I did update the FileSweeper.cfg file and the nightly Hoover job did remove the files I added (FtpExportData.txt) and I am now able to update the config file.
Excellent!
I confirmed that the FileSweeper.cfg config file is now part
of the CDR as a document of type SweepSpecifications.
Closing ticket.
File Name | Posted | User |
---|---|---|
Screen Shot 2018-04-11 at 10.36.14 AM.png | 2018-04-11 10:38:01 | Kline, Bob (NIH/NCI) [C] |