Issue Number | 4114 |
---|---|
Summary | [Python] Upgrade Python on CDR Windows servers |
Created | 2016-05-27 07:04:31 |
Issue Type | Improvement |
Submitted By | Kline, Bob (NIH/NCI) [C] |
Assigned To | Englisch, Volker (NIH/NCI) [C] |
Status | Closed |
Resolved | 2016-12-08 16:15:36 |
Resolution | Fixed |
Path | /home/bkline/backups/jira/ocecdr/issue.185076 |
We are currently running Python 2.7.2
2.7.2 (default, Jun 24 2011, 12:22:14) [MSC v.1500 64 bit (AMD64)]
The version currently available from ActiveState is 2.7.10.
Use of the older version is causing security warnings when running the package installer. See https://urllib3.readthedocs.org/en/latest/security.html for more information. I'm giving this an elevated priority, since it involves security.
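For reference, the warnings come from the outdated ssl support in Python 2 releases older than 2.7.9. The check below is only a sketch of how a server could be tested for that condition; the function name ssl_is_outdated is hypothetical and is not part of any attached script.

```python
import sys


def ssl_is_outdated(version_info=None):
    """Return True for a Python 2 interpreter older than 2.7.9.

    version_info defaults to the running interpreter; passing a tuple
    such as (2, 7, 2) makes the check easy to try out by hand.
    """
    if version_info is None:
        version_info = sys.version_info
    version = tuple(version_info[:3])
    return version[0] == 2 and version < (2, 7, 9)


if __name__ == "__main__":
    print("outdated ssl support: %s" % ssl_is_outdated())
```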
I have attached a script that can be run to install most of the third-party modules we need using pip. The following modules can't be installed by pip (at least not on servers without compilers):
ndscheduler
PIL
pychecker (I'm inclined to skip this and use pylint instead)
MySQL-python
pycrypto
I'll come up with an up-to-date set of instructions for installing these and attach it here.
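One way to see which of these modules a server still lacks is to try importing each one. This is a minimal sketch, not the attached script: the helper name find_missing_modules is hypothetical, and MySQLdb and Crypto are assumed to be the import names for MySQL-python and pycrypto.

```python
import importlib


def find_missing_modules(names):
    """Return the subset of names which cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Import names (assumed) for the packages pip could not install.
    wanted = ["ndscheduler", "PIL", "pychecker", "MySQLdb", "Crypto"]
    for name in find_missing_modules(wanted):
        print("not installed: %s" % name)
```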
I've assigned this ticket to Volker, as he's shepherding the related issue https://tracker.nci.nih.gov/browse/WEBTEAM-9279.
The log file has been copied to
D:\CDR\logs
There were a couple of warnings, and two packages couldn't be installed because they need to be built: pylint and sqlalchemy
Sorry, you must have missed my note to Seth, telling him I'll have the instructions ready tomorrow. I just posted a replaced version of the script which forces pre-built binaries.
I have a suspicion that pycrypto isn't used any more (I see indications on the web that it's a dead package), and as I noted above we can probably abandon pychecker (I've never liked it much, as it runs your code, which is sometimes dangerous; pylint is better). And PIL (which is also looking pretty dead) can, I think, be replaced by pillow. That leaves ndscheduler and MySQL-python, which I think we have instructions for somewhere.
I've never used pychecker myself except when Alan asked me to test something. I have no problem with removing those other packages you mentioned.
No, I didn't miss your note to Seth but I missed that you created an updated run-pip.bat. I'll wait until tomorrow.
Replacing PIL with pillow requires modifications to the import statements which use that module. Oddly, the fork (named "pillow" instead of "PIL") requires "from PIL import Image" or "from PIL import ImageEnhance", where the original package (somewhat sloppily) exposed Image and ImageEnhance as global names. Here are the four files which need the modification:
./Inetpub/wwwroot/cgi-bin/cdr/GetCdrImage.py
./Inetpub/wwwroot/cgi-bin/cdr/ResizeImage.py
./Inetpub/wwwroot/cgi-bin/cdr/TestPythonUpgrade.py
./Mailers/cdrlatexlib.py
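The import change can be illustrated with a small sketch. It is not the code from the branch: the function name load_imaging_modules is hypothetical, and the fallback to the old global names is shown only to contrast the two layouts.

```python
def load_imaging_modules():
    """Return (Image, ImageEnhance), or None if no imaging package exists."""
    try:
        # Layout required by pillow.
        from PIL import Image, ImageEnhance
    except ImportError:
        try:
            # Global names exposed by the original PIL package.
            import Image
            import ImageEnhance
        except ImportError:
            return None
    return Image, ImageEnhance


if __name__ == "__main__":
    modules = load_imaging_modules()
    print("imaging modules available: %s" % (modules is not None))
```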
I think pycrypto was formerly used to meet a dependency in paramiko, for which pip now uses the cryptography package.
I have attached the installer file for the MySQL-python package (pip doesn't have it); it can be run directly.
I have also attached ndscheduler.tar.bz2, which needs to be unpacked and installed in a command window as follows:
D:
cd \tmp
tar xjf \original\location\ndscheduler.tar.bz2
cd ndscheduler
python setup.py install
CBIIT ticket to perform the parts of this for which the development team has insufficient permissions.
There's a fifth script which needs modification (the behavior of the cStringIO module has changed):
Index: DownloadCTGovProtocols.py
===================================================================
--- DownloadCTGovProtocols.py (revision 14236)
+++ DownloadCTGovProtocols.py (working copy)
@@ -73,6 +73,8 @@
         root = etree.XML(rows[0][0].encode("utf-8"))
         self.transform = etree.XSLT(root)
     def normalize(self, doc):
+        if isinstance(doc, unicode):
+            doc = doc.encode("utf-8")
         fp = cStringIO.StringIO(doc)
         tree = etree.parse(fp)
         return etree.tostring(self.transform(tree))
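The idea behind the patch (encode text to UTF-8 bytes before wrapping it in a file-like object) can be sketched without lxml. This is an illustration under assumptions, not the script's code: the helper name as_byte_stream is hypothetical, and io.BytesIO stands in for cStringIO.StringIO so that the sketch behaves the same way under Python 2.7 and Python 3.

```python
import io


def as_byte_stream(doc, encoding="utf-8"):
    """Return a file-like object of bytes for a text or bytes document."""
    if not isinstance(doc, bytes):
        # Text (unicode in Python 2, str in Python 3) must be encoded first.
        doc = doc.encode(encoding)
    return io.BytesIO(doc)


if __name__ == "__main__":
    stream = as_byte_stream(u"<title>caf\xe9</title>")
    print(len(stream.read()))
```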
I have packaged up the post-processing steps as much as I believe is possible. The steps will now be:
1. Back up D:\Python
2. Uninstall the existing Python and remove D:\Python
3. Run nciis-p401.nci.nih.gov\Group03\OCPL\OCPL_Cross\CDR\CdrBuild\ActivePython-2.7.10.12-win64-x64.msi and install to D:\Python
4. Make everything at and under D:\Python world-readable
5. Run nciis-p401.nci.nih.gov\Group03\OCPL\OCPL_Cross\CDR\CdrBuild\Scripts\python-upgrade-postprocess.bat
We will be able to run step 5 ourselves on QA. CBIIT will need to run them all on STAGE and PROD. The last step takes care of replacing the five files which needed to be modified (see above; I've created a branch for these), installing all of the third-party Python packages, and registering the COM ADO libraries. I've tested that last script pretty thoroughly on one of my own systems, and I'm reasonably confident it will work on QA and the upper tiers. Still waiting on CBIIT to fix the permissions on DEV (I don't want to start down time on QA before DEV is working again). Seems we're down to one response per day again on that ticket.
I have removed one more obstacle from the build process. I was trying to figure out a way for the MySQL-python installer to run synchronously without a GUI (I had needed to put that step at the very end because the batch file would keep going while the GUI installer for the prebuilt binary launched asynchronously). It didn't look as if there was any way to provide a command-line option to control the installer's behavior, so I decided to create my own wheel, so we could install with pip. I succeeded, and the result is in the CdrBuild directory on the L: drive. It wasn't easy, but an extra benefit was that I was able to upgrade from 1.2.3 to 1.2.5 (the latest version, if you don't count the Debian/Ubuntu fork).
Here's what I needed to do in order to build the wheel:
1. Go to https://pypi.python.org/pypi/MySQL-python and download and unpack the source for the latest version (MySQL-python-1.2.5.zip in this case).
2. Go to https://www.microsoft.com/en-us/download/details.aspx?id=44266 and download and install Microsoft's C++ compiler for Python 2.7.
3. Go to http://dev.mysql.com/downloads/connector/c/6.0.html and download and install the 64-bit MSI installer (mysql-connector-c-6.0.2-winx64.msi, not the one for VS2005). It's important to use version 6.0.2, because later versions are incompatible with expectations in Dustman's source code. I have filed a bug report for this incompatibility, which appears to be an oversight on the part of the MySQL devs.
4. Open a console window in the directory created in step #1 (e.g., Downloads\MySQL-python-1.2.5).
5. Edit site.cfg and change
   connector = C:\Program Files (x86)\MySQL\MySQL Connector C 6.0.2
   to
   connector = C:\Program Files\MySQL\MySQL Connector C 6.0.2
6. Run
   pip install wheel
7. Run
   python setup.py bdist_wheel
8. The wheel will be in the dist directory. You can install it with
   cd dist
   pip install MySQL_python-1.2.5-cp27-cp27m-win_amd64.whl
I have copied the wheel I built, as well as the tools needed to build it (Dustman's 1.2.5 source code, the MySQL Connector C 6.0.2, and Microsoft's C++ compiler for Python 2.7) to the CdrBuild directory on the L: drive:
MySQL-python-1.2.5.zip
VCForPython27.msi
mysql-connector-c-6.0.2-winx64.msi
I created a wheel for ndscheduler. I think this makes the process more likely to succeed, and I know it removes dependencies on drive letter mapping for the network drive and speeds things up significantly. I put it on the L: drive in the CdrBuild directory:
ndscheduler-0.1.1-py2-none-any.whl
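The two wheel file names say which interpreters they fit: the ndscheduler wheel is tagged py2-none-any (pure Python), while the MySQL-python wheel is tagged cp27-cp27m-win_amd64 (compiled for 64-bit CPython 2.7 on Windows). The sketch below splits a name into those tags following the wheel file naming convention; the function name parse_wheel_name is hypothetical and the parsing is deliberately simple.

```python
def parse_wheel_name(filename):
    """Split a wheel file name into its name, version, and tag fields."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel file name: %r" % filename)
    parts = filename[:-len(".whl")].split("-")
    if len(parts) == 5:
        name, version, python_tag, abi_tag, platform_tag = parts
        build = None
    elif len(parts) == 6:
        # An optional build tag sits between the version and the Python tag.
        name, version, build, python_tag, abi_tag, platform_tag = parts
    else:
        raise ValueError("unexpected wheel file name: %r" % filename)
    return {
        "name": name,
        "version": version,
        "build": build,
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }


if __name__ == "__main__":
    for wheel in ("ndscheduler-0.1.1-py2-none-any.whl",
                  "MySQL_python-1.2.5-cp27-cp27m-win_amd64.whl"):
        print(parse_wheel_name(wheel))
```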
Python and all third-party modules have been upgraded on all of the non-production CDR Windows servers. Volker and I have done some testing of publishing, the new scheduler, and some other behind-the-scenes functionality.
~oseipokuw and ~JutheR: Before we promote the upgrades to production, it would be prudent if you checked at least the most critical of your reports on DEV or QA.
Thanks,
Bob
I have checked the reports listed on the OCCM Board Managers page in the Admin menus (with the exception of the General Use reports) and didn't run into any problems. However, the PCIB stats report is showing the older version (prior to the enhancements made in OCECDR-4096)--I wasn't sure if that was to be expected.
Also, I only tested one Board member correspondence mailer. Bob, if you think I need to check all of the mailers, please let me know.
William, let me know if you need help testing the other reports. Thanks.
QA had the newer version of the PCIB stats report, but DEV had reverted back. I have restored the newer version on DEV. Testing any of the board member correspondence mailers should be sufficient.
Thanks.
The audio import report produced a Python script error when I tried to load an existing file (1.Week_115.zip).
A problem occurred in a Python script.
D:\cdr\Log\tmpcdykix.html contains the description of this error.
The audio review report also produced the following error. However, these are all old existing files so I am not sure if they actually exist on the FTP server.
These are the files that are generating the error message:
Week_113.zip
Week_115.zip
"Error opening zipfile 'd:/cdr/Audio_from_CIPSFTP/Week_113.zip':<br /> Exception Type: <class 'zipfile.BadZipfile'></br /> Exception msg: File is not a zip file"
Was this on DEV?
Yes.
You are seeing these errors because the files you're trying to download do not exist on the DEV FTP server.
Those two files are empty, hence the message saying that they're not zip files. So this would be unrelated to the Python upgrade. Thanks for checking, though.
The files do exist, but they're both zero bytes in length.
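A check along these lines would tell an empty file apart from a missing or corrupt one before the report tries to open it. It is a sketch only; the function name describe_archive and the returned labels are hypothetical and are not taken from the report's code.

```python
import os
import zipfile


def describe_archive(path):
    """Classify a file as 'missing', 'empty', 'not a zip file', or 'ok'."""
    if not os.path.exists(path):
        return "missing"
    if os.path.getsize(path) == 0:
        return "empty"
    if not zipfile.is_zipfile(path):
        return "not a zip file"
    return "ok"


if __name__ == "__main__":
    print(describe_archive("d:/cdr/Audio_from_CIPSFTP/Week_113.zip"))
```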
I am getting the following error while running the Bounced Emailers report
"502 - Web server received an invalid response while acting as a gateway or proxy server.
There is a problem with the page you are looking for, and it cannot be displayed. When the Web server (while acting as a gateway or proxy) contacted the upstream content server, it received an invalid response from the content server."
Oh, I see. I was looking on the Linux side thinking the program is trying to retrieve the files from the FTP server but the program accesses the files once they have already been downloaded to the CDR server.
That was the result of work on OCECDR-4107 for Einstein, for which I added the requirement for a valid CDR session on the Linux side, but neglected to take care of the comparable change on the Windows side. Fixed on DEV (but still broken on QA for the moment). Won't be a problem on the upper tiers, because none of the Einstein modifications will have been promoted yet.
Verified on DEV.
I am getting the following error message for the CTGovProtocols vs. Early EntryDate report
"Server Error
502 - Web server received an invalid response while acting as a gateway or proxy server.
There is a problem with the page you are looking for, and it cannot be displayed. When the Web server (while acting as a gateway or proxy) contacted the upstream content server, it received an invalid response from the content server."
This is a report you told us we could remove earlier this year. The script is gone, but the menu entry still needs to be dropped.
That is right. Should I create a ticket to remove it from the menu or wait until all the protocol menus get removed eventually?
I am done testing. All the reports appear to work well.
I have dropped it from the menu in the Einstein branch (and on DEV).
This is in production.