Issue Number | 5131 |
---|---|
Summary | Update to Python 3rd party modules in Ohm broke audio file upload automation |
Created | 2022-08-17 12:24:04 |
Issue Type | Bug |
Submitted By | Dugan, Amy (NIH/NCI) [C] |
Assigned To | Kline, Bob (NIH/NCI) [C] |
Status | Closed |
Resolved | 2022-08-19 09:43:35 |
Resolution | Fixed |
Path | /home/bkline/backups/jira/ocecdr/issue.325368 |
Automated audio file importing
As part of the Python upgrade for the Ohm release, the third-party modules were updated to their current versions. The package that the GlossaryTermAudioReview.py script uses to read spreadsheets was apparently modified recently to drop support for .xlsx workbooks, and it now reads only the legacy .xls format. (That package's specialty is handling Microsoft's original format more completely than any other package, so that is the direction its maintainers are taking.)
This issue has caused the glossary and Spanish teams to revert to a manual process for upload and review. The added work for English is about 10 hours for 30 terms, assuming 10 uninterrupted hours spent on just those 30 terms. The level of effort (LOE) for Spanish appears to be higher. We expect the current batch of 72 terms to take up to a month or two, as the team fits this manual work around other priorities. Fixing this issue is therefore a high priority.
Fixed on DEV.
Hi ~bkline, was the above info intended for OCECDR-5132? If it was meant for this ticket (OCECDR-5131), then more clarification is needed to help with testing. Thanks!
Sorry, moved the comment to the right ticket.
Thanks, Bob! I assume the fix allows the program to continue to recognize .xlsx files as it did previously, right? I am trying to find out which file extensions we should be testing with on DEV.
Right, the audio software uses modern Excel files (.xlsx) to track the pronunciations. The legacy Excel format (.xls) is unsupported. Since the initial step in the process is the creation of a new Excel workbook by the software, someone would have to deliberately wander from the default path and explicitly save the workbook in the older format to end up with a problem. I think we should be OK.
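For testers who want to confirm which format a given workbook is in, the file itself can be checked: an .xlsx workbook is a ZIP archive, while a legacy .xls workbook is an OLE2 compound file, and each starts with a distinctive byte signature. The following is a minimal sketch of such a check using only the Python standard library. It is not the code used by the audio software, and the names `workbook_format` and `require_xlsx` are hypothetical.

```python
import zipfile
from pathlib import Path

# Leading bytes that identify each container format.
ZIP_MAGIC = b"PK\x03\x04"                         # .xlsx is a ZIP archive
OLE2_MAGIC = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"  # legacy .xls is an OLE2 file


def workbook_format(path):
    """Return "xlsx", "xls", or "unknown" based on the file's contents."""
    with open(path, "rb") as fp:
        head = fp.read(8)
    if head.startswith(OLE2_MAGIC):
        return "xls"
    if head.startswith(ZIP_MAGIC) and zipfile.is_zipfile(path):
        with zipfile.ZipFile(path) as zf:
            names = set(zf.namelist())
        # Plausibility check only: the manifest is common to all Office
        # Open XML packages, and xl/workbook.xml is the conventional
        # (not strictly required) location of the workbook part.
        if "[Content_Types].xml" in names and "xl/workbook.xml" in names:
            return "xlsx"
    return "unknown"


def require_xlsx(path):
    """Raise ValueError unless the file looks like a modern .xlsx workbook."""
    fmt = workbook_format(path)
    if fmt != "xlsx":
        raise ValueError(f"{Path(path).name}: expected .xlsx, found {fmt}")
    return path
```

Checking the contents rather than the file extension also catches a legacy workbook that has been renamed to end in .xlsx.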
Thanks for the clarification. Please upload the attached zip file to the DEV server for testing. It is the same file Amy sent you to upload to PROD. Week_2022_30.zip
Amy has already tested that batch on DEV.
OK. That's good to know. We will do more testing with this file and probably another file too. Thanks!
If you want to do end-to-end testing, you'll need to craft a custom subset of the batch. Otherwise, at a certain point in the pipeline, the job will fail when it hits a document which is on PROD, but not on the tier on which you're testing.
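One way to build such a subset is to copy only selected members of the batch archive into a smaller zip file. The sketch below shows that step under stated assumptions: the function name `copy_zip_subset` and its `keep` callback are hypothetical, and the internal layout of the batch archives is not described in this ticket. The workbook inside the archive would presumably also need to be trimmed to the same terms, which this sketch does not attempt.

```python
import zipfile


def copy_zip_subset(source, target, keep):
    """Copy the members for which keep(name) is true into a new archive.

    Each copied member retains its original name, timestamp, and
    compression setting. Returns the list of copied member names.
    """
    copied = []
    with zipfile.ZipFile(source) as src, zipfile.ZipFile(target, "w") as dst:
        for info in src.infolist():
            if keep(info.filename):
                # Passing the ZipInfo preserves the member's metadata.
                dst.writestr(info, src.read(info.filename))
                copied.append(info.filename)
    return copied
```

The caller decides which members qualify, for example by passing a function that tests each name against a list of files known to match documents on the test tier.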
Done.
Getting "No zip files found to be transferred" on QA.
Oops, put it on the wrong tier. Please try again.
Thanks! It worked!
Verified on QA. Thanks!
File Name | Posted | User |
---|---|---|
Week_2022_30.zip | 2022-08-29 17:37:01 | Osei-Poku, William (NIH/NCI) [C] |
Week_2022_36.zip | 2022-09-06 13:34:36 | Osei-Poku, William (NIH/NCI) [C] |