Issue Number | 4581 |
---|---|
Summary | File sweeper reports failure when calculating file size change |
Created | 2019-02-26 08:24:27 |
Issue Type | Bug |
Submitted By | Kline, Bob (NIH/NCI) [C] |
Assigned To | Englisch, Volker (NIH/NCI) [C] |
Status | Closed |
Resolved | 2019-03-12 18:41:29 |
Resolution | Fixed |
Path | /home/bkline/backups/jira/ocecdr/issue.240744 |
The CDR FileSweeper failed on DEV at Tue Feb 26 01:15:04 2019.
Error message was:
FATAL error: File "scheduler-service.log": Truncated and remaining sizes 100001101 + 567703517 != original size 667703517
It's not clear whether this represents a failure in the file manipulation or in the logic which determines whether the manipulation was performed correctly. Determine which is the problem and fix it.
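The arithmetic in the error message can be checked directly. The following is a minimal sketch; `size_discrepancy` is a hypothetical helper for illustration and is not part of the FileSweeper code.

```python
def size_discrepancy(truncated, remaining, original):
    """Return how many bytes the two parts exceed the recorded original size by."""
    return truncated + remaining - original


# Values copied from the error message above.
extra = size_discrepancy(100001101, 567703517, 667703517)
print(extra)  # 1101
```

The two parts add up to 1101 bytes more than the recorded original size, and the remaining size is exactly 100000000 bytes less than the original. The numbers alone do not show whether the file manipulation or the verification logic is at fault.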
It appears there is another problem which may be related. An archive of the log file scheduler-service.log has been created every day for the last 10 months, but the log file itself never gets truncated. As a result, the log file is currently over 650MB in size.
I updated the FileSweeper.cfg file to exclude the scheduler-service.log file from being truncated as part of the Hoover job. The system tries to archive and truncate the file as part of the nightly job, but the scheduler is running and holds a lock which prevents the file from being modified. As a result, an archive file was created every night although the log file never shrank, and it grew to 650MB on DEV (700+MB on QA). The proper approach would be to stop the scheduler service before running the Hoover job.
I copied the large log file to a file named scheduler-service-offline.log and included this file in the FileSweeper.cfg steps.
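The off-line copy could be made with a few lines of Python. This is only a sketch: `make_offline_copy` and its `suffix` parameter are hypothetical names, and the ticket does not say how the copy was actually made.

```python
import os
import shutil


def make_offline_copy(log_path, suffix="-offline"):
    """Copy e.g. scheduler-service.log to scheduler-service-offline.log."""
    base, ext = os.path.splitext(log_path)
    offline_path = base + suffix + ext
    # copy2 copies the contents and preserves the file timestamps.
    shutil.copy2(log_path, offline_path)
    return offline_path
```

The sweeper can then archive and truncate the copy, which the running scheduler does not hold open.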
I ran into an error trying to update the *.cfg file in the CDR database.
C:\cygwin64\home\volker\temp>python ..\CDR\git\cdr-tools\DevTools\Utilities\StoreSweepSpecs.py --tier DEV --comment "my comment" --session <session> FileSweeper.cfg
Traceback (most recent call last):
  File "..\CDR\git\cdr-tools\DevTools\Utilities\StoreSweepSpecs.py", line 51, in <module>
    save_opts["doc"] = str(doc)
  File "C:\CDR\lib\Python\cdr.py", line 1295, in __str__
    xml = self.xml.decode("utf-8")
AttributeError: 'str' object has no attribute 'decode'
and fixed this with the following modification in cdr.py:
< # Python 3 compatible
< if is_python3:
< xml = self.xml
< else:
< xml = self.xml.decode("utf-8")
<
---
> xml = self.xml.decode("utf-8")
I assume the diff output is reversed. However, I'd be more inclined to do something like this:
xml = self.xml
if not isinstance(xml, unicode):
    xml = xml.decode("utf-8")
Oh, I see! You already made this change on Feb 2. I can ignore my changes and copy your version from the python3 branch.
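For reference, a guard of this kind can be written so that it runs unchanged under both Python 2 and Python 3 by testing for `bytes` rather than `unicode`. This is a minimal sketch and not necessarily what the python3 branch contains; `xml_as_text` is a hypothetical name.

```python
def xml_as_text(xml, encoding="utf-8"):
    """Return xml as text, decoding only when it is a byte string."""
    # In Python 2, bytes is an alias for str, so this also covers that case.
    if isinstance(xml, bytes):
        return xml.decode(encoding)
    return xml
```

Values that are already text pass through untouched, which avoids the AttributeError shown in the traceback above.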
Here are the highlights of the changes I've made to the FileSweeper.cfg file:
Excluding the scheduler-service.log file from the nightly Hoover job and archiving an off-line copy instead
Including the FileSweeper.log file in the config file but also archiving an off-line version
Including the glossifier.log file
Removing SweepSpecs for CTGov log files
Adjusting SweepSpecs where appropriate
I've created a batch file and submitted a service ticket asking CBIIT to do the following on both STAGE and PROD:
stop the CDR Scheduler service
copy and truncate the scheduler-service.log file
start the CDR Scheduler service again
The batch file is located in
V:\hoover-20190309\cdr-hotfix-20190308.bat
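The contents of the batch file are not shown in this ticket. As a rough sketch of the same sequence in Python, the function below takes the stop and start actions as callables, since the exact service commands are not given here. The function name and parameters are hypothetical, and truncating the log to zero length is an assumption.

```python
import shutil


def archive_with_service_stopped(log_path, archive_path, stop_service, start_service):
    """Stop the service, copy the log aside, empty it, then restart the service."""
    stop_service()
    try:
        shutil.copy2(log_path, archive_path)
        # Opening in "w" mode truncates the file to zero length (assumed behavior).
        with open(log_path, "w"):
            pass
    finally:
        # Restart the service even if the copy or the truncation failed.
        start_service()
```

Putting the restart in a `finally` clause means a failed copy does not leave the scheduler stopped.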