PDQ Issues

Issue Number	4418
Summary	Reduce filtering memory usage
Created	2018-02-15 07:49:23
Issue Type	Improvement
Submitted By	Kline, Bob (NIH/NCI) [C]
Assigned To	Kline, Bob (NIH/NCI) [C]
Status	Closed
Resolved	2018-03-07 09:04:04
Resolution	Fixed
Path	/home/bkline/backups/jira/ocecdr/issue.221307

Description

Long-running jobs which filter many thousands of CDR documents cause memory usage to increase gradually, risking severe performance degradation or even crashes. Track down the cause of this memory usage increase and if possible remove it or mitigate it.
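
One general way to look for this kind of gradual growth is to compare tracemalloc snapshots taken before and after a run of batches. The sketch below is only an illustration and not a record of how this issue was investigated: process_batch and leaky_cache are hypothetical stand-ins for a filtering job, and the leak is deliberate. tracemalloc only sees allocations made through Python's memory allocators, so a leak inside native driver code would need process-level monitoring instead.

```python
import tracemalloc


def process_batch(cache, start, size):
    """Hypothetical stand-in for filtering a batch of documents.

    It leaks on purpose by keeping every result in a long-lived dict.
    """
    for doc_id in range(start, start + size):
        cache[doc_id] = bytearray(1000)


tracemalloc.start()
leaky_cache = {}
before = tracemalloc.take_snapshot()

for batch in range(10):
    process_batch(leaky_cache, batch * 100, 100)

after = tracemalloc.take_snapshot()

# Rank source lines by how much more memory they hold after the run.
top = after.compare_to(before, "lineno")
for stat in top[:3]:
    print(stat)

# Net growth in traced memory between the two snapshots, in bytes.
growth = sum(stat.size_diff for stat in top)
tracemalloc.stop()
```

The line with the largest positive size difference is the first place to look; here it should be the assignment inside process_batch.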

Comment entered 2018-02-26 12:45:06 by Kline, Bob (NIH/NCI) [C]

I traced the memory leak to a bug in the adodbapi package. Possible solutions:

- go through all the code and manually close all cursor objects before they go out of scope
- replace the adodbapi package with something else (probably pyodbc)
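
The first option could look roughly like the sketch below, which wraps each cursor in contextlib.closing so that close() is called even when a query raises. This is an assumption-laden illustration, not code from the CDR library: sqlite3 stands in for the real database driver so the example runs anywhere, and the document table and count_documents function are made up. The pattern relies only on the cursor() and close() methods that the Python DB-API defines.

```python
import sqlite3
from contextlib import closing


def count_documents(conn):
    """Return the number of rows in the (hypothetical) document table."""
    # closing() calls cursor.close() on exit, whether or not the query fails,
    # so the cursor is released before it goes out of scope.
    with closing(conn.cursor()) as cursor:
        cursor.execute("SELECT COUNT(*) FROM document")
        return cursor.fetchone()[0]


# sqlite3 is a stand-in for the real driver; the table is illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE document (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany(
    "INSERT INTO document (title) VALUES (?)",
    [("first",), ("second",), ("third",)],
)

# Simulate a long-running job that opens and closes many cursors.
for _ in range(1000):
    total = count_documents(conn)

print(total)
```

The cost of this approach is that every call site which creates a cursor has to be found and changed, which is what makes swapping the driver attractive by comparison.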

Comment entered 2018-03-07 09:04:04 by Kline, Bob (NIH/NCI) [C]

I have replaced adodbapi with pyodbc, which eliminated the memory leak and appears to have significantly sped up some reports.

https://github.com/NCIOCPL/cdr-lib/commit/215bec3

Comment entered 2018-04-11 10:03:37 by Kline, Bob (NIH/NCI) [C]

This turns out to have been the right decision, to a greater degree than I anticipated. I had offered to assist the maintainer of the adodbapi package in getting some of the long-standing bugs taken care of, and although he was initially receptive, he appears to have gone into hibernation again. The pyodbc package is supported by Microsoft, which should make it less likely we're relying on a package which would eventually be abandoned (though you never really know with Microsoft).

