Issue Number | 4129 |
---|---|
Summary | EOL for Python 2.7 |
Created | 2016-06-30 10:59:37 |
Issue Type | Improvement |
Submitted By | Kline, Bob (NIH/NCI) [C] |
Assigned To | Kline, Bob (NIH/NCI) [C] |
Status | Closed |
Resolved | 2019-11-13 17:43:29 |
Resolution | Fixed |
Path | /home/bkline/backups/jira/ocecdr/issue.187173 |
Python 2.7 support will end January 1, 2020. This is a separate ticket from OCECDR-4114 because upgrading to Python 3.x is harder but less urgent. "Utilities" is the best I could find for a component.
We should start this no later than the beginning of November, if not sooner.
Time for an update 🙂
😛
I think we're going to make it. 😃
What's in it for me? Anything you want me to look at?
Ah, your plate's not full? If so, let me take a look at my task list. I had assumed between keeping the CDR home fires burning, ramping up on Drupal, and the vendor and subsidiary site support they had piled on you, you'd be swamped. If that's not true, the first little thing it would be good for you to tackle would be to back out the obsolete document types (and their schemas) which had been purged recently, but somehow managed to be dragged back from the grave on DEV by the dev-data restoration script.
(Update: doesn't look like there are document types for these, so you just need to mark the schemas as deleted.)
ID | DOCTYPE | USER | SAVED | TITLE |
---|---|---|---|---|
799139 | schema | volker | 2019-08-30 13:34:23 | EmailerDocument.xml |
799138 | schema | volker | 2019-08-30 13:34:21 | SubmittedTrial.xml |
799137 | schema | volker | 2019-08-30 13:34:19 | HereditaryCancerSyndrome.xml |
799136 | schema | volker | 2019-08-30 13:34:17 | EligibilityCriterion.xml |
799135 | schema | volker | 2019-08-30 13:34:14 | EmailerManifest.xml |
799134 | schema | volker | 2019-08-30 13:34:12 | EmailerRecipient.xml |
799133 | schema | volker | 2019-08-30 13:34:09 | GP.xml |
(Here's a cool tip. You may have known this already, but that table was created by running a query in the CDR ad-hoc query interface (New Docs, a query I just created and saved), selecting some rows (including the headers), pasting into the Visual view of this comment editor, and using the toolbar commands to eliminate the two rows I didn't need. Jira has added some nifty functionality recently! 🙂)
After driving the wooden stake into those retired document types/schemas, the next task to tackle, if you should choose to accept your mission, would be to try and swap out the database layer currently used by the scheduler (pymssql, based on the freetds project) and replace it with pyodbc. While you're in the scheduler, you could tackle the task of running the 2to3 upgrade tool on our own code in that repository. There are different ways of using that tool. One is to blindly run it in "make the changes" mode and then see if the software works under Python 3. I prefer to run it without the -w switch, making the recommended changes with which I agree by hand. Sometimes when there are lots of changes to be made to a file, I will use the -w flag, but redirect the output to a file which I can then bring up to review, backing out or modifying any changes I don't think were right. Examples of changes I won't use:
the addition of an unnecessary extra set of parentheses for print() (so print("foo %s" % bar) becomes print(("foo %s" % bar)))
unnecessary wrapping of iterables with list() when I know I'm using the iterable in a way which doesn't need list()
replacing basestring with str, which changes the semantics, and in some cases breaks code (the other two examples are just ugly annoyances, but this one is dangerous)
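To illustrate why the basestring-to-str rewrite is the dangerous one, here is a minimal Python 3 sketch. The function names are hypothetical and not taken from our code. Under Python 2, isinstance(value, basestring) accepted both byte strings and Unicode strings; after a mechanical rewrite to isinstance(value, str), bytes values silently stop matching.

```python
def normalize_title_naive(value):
    """What a blind basestring -> str rewrite leaves behind (hypothetical helper)."""
    if isinstance(value, str):
        return value.strip()
    return None  # bytes values now fall through unnoticed


def normalize_title(value):
    """Hand-reviewed version which deals with bytes explicitly (hypothetical helper)."""
    if isinstance(value, bytes):
        value = value.decode("utf-8")  # assumption: incoming bytes are UTF-8
    if isinstance(value, str):
        return value.strip()
    raise TypeError(f"expected str or bytes, got {type(value).__name__}")
```

The reviewed version makes the decision about bytes visible at the call site, which is the kind of change that has to be made by hand rather than accepted from the tool.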
And then your next task for this ticket would be to eliminate the encoding declaration from the xml decl at the top of any documents (filters, mostly) and from any of our code which puts it there. The encoding="utf-8" part of <?xml version="1.0" encoding="utf-8"?> is redundant, because that's the default encoding assumed if none is specified. And for Python 3, the XML parsers (or at least lxml, the one we use) will generate a Unicode string with etree.tostring(root, encoding="unicode"). We will want to move toward always manipulating string values as unencoded str objects, only encoding them at the moment they're being serialized for export. The parser will balk at etree.tostring(root, encoding="unicode"), however, if the document has an encoding of utf-8 attached to it, so we need to eliminate that encoding declaration from our documents and code. This doesn't apply to the charset="utf-8" which we still want in the meta tag of HTML pages we generate, of course.
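A rough sketch of that workflow follows. To stay self-contained it uses the standard library's xml.etree.ElementTree as a stand-in for lxml, which is an assumption: the two libraries do not behave identically in every case. The regular expression and the helper names are hypothetical, not taken from our code.

```python
import re
import xml.etree.ElementTree as ET

# Matches the encoding pseudo-attribute inside a leading XML declaration.
_ENCODING = re.compile(r'\A(<\?xml[^>]*?)\s+encoding\s*=\s*(["\'])[^"\']*\2')


def strip_encoding_declaration(xml_text):
    """Drop encoding="..." from the XML declaration, leaving the rest alone."""
    return _ENCODING.sub(r"\1", xml_text, count=1)


def serialize_for_export(root):
    """Work with str internally; encode only when the bytes leave the program."""
    text = ET.tostring(root, encoding="unicode")  # str, no XML declaration
    return text.encode("utf-8")


doc = '<?xml version="1.0" encoding="utf-8"?><Term>caf\u00e9</Term>'
cleaned = strip_encoding_declaration(doc)
root = ET.fromstring(cleaned)
exported = serialize_for_export(root)
```

The declaration is rewritten once, at the point where the document text is loaded, so everything downstream sees plain str values.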
Ah, your plate's not full? If so, let me take a look at my task list. I had assumed between keeping the CDR home fires burning, ramping up on Drupal, and the vendor and subsidiary site support they had piled on you, you'd be swamped.
You know how that works: You're swamped until the water recedes and you're waiting for the next tsunami. It could roll in tomorrow or next Friday. Besides, I don't want you to create all the bugs yourself. 🙂
... tackle the task of running ...
Looks like I had already done that for the scheduler repo. Doesn't mean all the code will run correctly under Python 3, but it will when you're finished. 🙂
Here's another sticky issue with the scheduler running on Python 3. I was working on the TestPythonUpgrade.py CGI smoke test, and it reported that the apns package hadn't yet been installed. So I installed it using pip (making sure I got the package we've been using with the existing servers, as there's more than one APNs implementation floating around). I couldn't import it with Python 3, as it still had syntax which only works on Python 2.x (for example, except Exception, e instead of except Exception as e). I dug into the project's issues, and found that this had been reported as a bug (more than once: https://github.com/djacobs/PyAPNs/issues/163 and https://github.com/djacobs/PyAPNs/issues/177). Apparently, the version on pypi is out of date, and no one seems interested in addressing that problem. So in order to use this package with Python 3 we would have to install it directly from GitHub instead of using pip. Not an appealing path. Please investigate and determine whether (a) this package really is needed by ndscheduler and (b) whether it would be feasible to swap in one of the other apns implementations which actually supports Python 3.
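A check along these lines can distinguish "not installed" from "installed but not importable under Python 3". This is only a sketch of the idea, not what TestPythonUpgrade.py actually does; the function name is hypothetical.

```python
import importlib


def check_imports(module_names):
    """Try to import each module; return {name: reason} for the ones which fail."""
    problems = {}
    for name in module_names:
        try:
            importlib.import_module(name)
        except Exception as e:
            # A missing package raises ModuleNotFoundError; a package which
            # still contains Python 2-only syntax raises SyntaxError on import.
            problems[name] = f"{type(e).__name__}: {e}"
    return problems


# The second name is deliberately bogus, to show the failure path.
problems = check_imports(["json", "no_such_module_for_this_sketch"])
```

Catching the broad Exception class is deliberate here, since the point is to report every reason a module cannot be loaded rather than to stop at the first one.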
Digging a little further myself, I'm starting to think we don't really need the apns package. In https://github.com/NCIOCPL/cdr-scheduler/blob/master/requirements.txt I see that this package is one of the "... dependencies for simple_scheduler only" and that's just a demo example we're not using. So I think we can just drop apns.
Are we creating separate sub-tasks/branches for these individual items or is this all going to be under OCECDR-4129?
There is another schema I've been meaning to get rid of, which also got reinstated by an earlier refresh: xxtest (doc_type = 40, schema = 531742, title filter = 792271)
Can I just delete these documents (mark them as deleted) in the document table and set the row in the doc_type table to inactive?
I've been tracking the tasks for this ticket in a separate list. Here's what that list looks like right now.
 Create new tools to facilitate the upgrade work ✔
tool to find all invocations of a function, method, constructors, etc. (has side benefit of checking for syntax errors in all Python code) ✔
tool to find all imports of a module ✔
tool to report unused installed modules ✔
make the client APIs/libraries work on non-Windows machines ✔
make non-IIS server for testing on MacBooks ✔
enhance XML normalization tools to make document comparison more useful ✔
eliminate encoding statement from <?xml ... declarations ✔
Run 2to3 on all Python source code (has to be done with inspection, as the tool makes some mistakes) ✔
core API ✔
legacy libraries ✔
CGI scripts ✔
publishing scripts ✔
mailers ✔
glossifier ✔
licensee ✔
scheduler ✔
database ✔
filters ✔
api tunnel ✔
bin ✔
build ✔
dev tools ✔
utilities ✔
report bugs in 2to3 tool to Python core team ✔
Get the https tunneling working under Python 3 ✔
Eliminate extra database packages ✔
scheduler (switch to pyodbc) ✔
change %s placeholders to ? ✔
remove home-grown database exception classes ✔
move timeouts from execute() to connect() ✔
make all connect() arguments keyword args ✔
use context blocks ("with conn.cursor() ...:") whenever changing code
Consolidate multiple functions for sending email ✔
create new class ✔
modify all calls ✔
test
Reduce logging to the class based on the standard library ✔
eliminate other logging classes/functions ✔
rewrite uses of those obsolete approaches ✔
Replace Page class so that it doesn't use a mix of Unicode and bytes
write new class ✔
test ✔
plug into advanced search page software ✔
add style rules to cdr.css for advanced pages ✔
update other uses (possibly spread over time, beyond release)
Fix handling of Unicode/bytes in communication with web server
always get incoming CGI parameters as strings, not bytes ✔
always work with strings, not bytes, during processing
use objects instead of strings while building HTML/XML
defer encoding of strings until they are going out the door
use urlencode() and parse_qs() for query parameter strings
replace unicodeToLatin1() ✔
in general, follow https://docs.python.org/3/howto/unicode.html
rewrite cdrcgi.header() ✔
fix sending spreadsheets ✔
Replace deprecated cgi.escape() ✔
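The Unicode/bytes items in the list above boil down to "decode on the way in, work with str, encode on the way out". Here is a small sketch of that pattern using only the standard library. The helper names and the Report.py URL are hypothetical, and html.escape() is shown as the documented replacement for cgi.escape() (note that it also escapes quotes by default).

```python
from html import escape
from urllib.parse import parse_qs, urlencode


def read_params(raw_query):
    """Decode an incoming query string once, at the boundary, into str values."""
    if isinstance(raw_query, bytes):
        raw_query = raw_query.decode("utf-8")  # assumption: UTF-8 on the wire
    return {name: values[0] for name, values in parse_qs(raw_query).items()}


def build_link(base_url, label, **params):
    """Build an HTML link from str values, letting the library do the quoting."""
    href = f"{base_url}?{urlencode(params)}"
    return f'<a href="{escape(href)}">{escape(label)}</a>'


def send(page):
    """Encode only at the moment the page goes out the door."""
    return page.encode("utf-8")


params = read_params(b"name=caf%C3%A9&id=42")
link = build_link("Report.py", "A & B", name=params["name"], id=params["id"])
payload = send(link)
```

Between read_params() and send() nothing has to know which encoding is in use, which is what makes the intermediate code easier to reason about.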
If you would feel more comfortable creating separate tickets for the individual tasks, feel free to do so.
I think that should do it.
so you just need to mark the schemas as deleted.)
Stupid question: This means active_status = 'D' and not active_status = 'I', right? I'm looking at it after the update and started wondering why I'm still seeing the records as part of the document view.
This means active_status = 'D' and not active_status = 'I', right?
Right. "I" corresponds to "Inactive" (also referred to as "blocked" by the users).
~oseipokuw - Any reason we shouldn't drop Protocols from the Reports menu? The only item on it is the Warehouse Box Number Report, and that will never have anything to report, now that the InScopeProtocol documents have been removed.
Ready for UAT.
File Name | Posted | User |
---|---|---|
Screen Shot 2019-02-09 at 10.56.54 AM.png | 2019-02-09 10:58:46 | Kline, Bob (NIH/NCI) [C] |
Screen Shot 2019-09-13 at 14.28.53.png | 2019-09-13 14:29:51 | Englisch, Volker (NIH/NCI) [C] |
Screen Shot 2020-02-07 at 1.32.53 PM.png | 2020-02-07 13:47:23 | Englisch, Volker (NIH/NCI) [C] |