CDR Tickets

Issue Number 3358
Summary [Media] Modify Publishing Software to Process Audio Files
Created 2011-05-11 14:21:42
Issue Type Improvement
Submitted By Englisch, Volker (NIH/NCI) [C]
Assigned To Englisch, Volker (NIH/NCI) [C]
Status Closed
Resolved 2011-09-01 17:58:55
Resolution Fixed
Path /home/bkline/backups/jira/ocecdr/issue.107686
Description

BZISSUE::5051
BZDATETIME::2011-05-11 14:21:42
BZCREATOR::Volker Englisch
BZASSIGNEE::Volker Englisch
BZQACONTACT::Bob Kline

The publishing software is currently only handling image data (jpeg and gif files) and needs to be expanded to also publish audio files.

As a side note, I'm thinking that instead of continuing to create a single Media manifest file we may want to create two separate files, one for images and one for audio files.
I have the feeling that our vendors will be asking for something like this once we're including the audio data.

Note: This issue is currently assigned to me as the default component
assignee, but feel free to raise your hand if one of you would like to
take it over (or not).

Comment entered 2011-05-11 14:25:00 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2011-05-11 14:25:00
BZCOMMENTOR::Volker Englisch
BZCOMMENT::1

(In reply to comment #0)
> I have the feeling that our vendors will be asking for something like this once
> we're including the audio data.

Strike that! The file name extension should be a clear giveaway.
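
For illustration only, here is a minimal sketch of how a consumer of the single manifest could tell images from audio by extension alone. The function names, the extension groups, and the idea that the manifest yields a plain list of file names are assumptions for this example, not the actual manifest format.

```python
import os

# Assumed extension groups: jpeg/gif and MP3 are the types named in this ticket.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".gif"}
AUDIO_EXTENSIONS = {".mp3"}


def classify_media(filename):
    """Return 'image', 'audio', or 'unknown' based on the file extension."""
    ext = os.path.splitext(filename)[1].lower()
    if ext in IMAGE_EXTENSIONS:
        return "image"
    if ext in AUDIO_EXTENSIONS:
        return "audio"
    return "unknown"


def split_manifest(filenames):
    """Group manifest file names (hypothetical input) by media kind."""
    groups = {"image": [], "audio": [], "unknown": []}
    for name in filenames:
        groups[classify_media(name)].append(name)
    return groups
```

With a check like this on the receiving side, a second manifest file would add nothing the extension does not already convey.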

Comment entered 2011-05-18 12:57:01 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2011-05-18 12:57:01
BZCOMMENTOR::Volker Englisch
BZCOMMENT::2

The WCM development team is asking to receive test documents via the Web service.
I'm therefore bumping up the priority for this issue so that they won't have to wait for us.

Comment entered 2011-05-18 14:41:19 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2011-05-18 14:41:19
BZCOMMENTOR::Volker Englisch
BZCOMMENT::3

I've modified the following programs
cdr.py
cdrpub.py
to allow publication of MP3 files.

A first audio file has been published.

Bob, if you enter the CDR-ID of a meeting recording for a hot-fix, our system will happily publish that document. The document will not be published as part of a regular publishing job because the select statement in the publishing document excludes those files.

Do we need to prevent documents that are being marked with the attribute of Usage=Internal from being published at all?

Comment entered 2011-05-19 13:43:02 by Kline, Bob (NIH/NCI) [C]

BZDATETIME::2011-05-19 13:43:02
BZCOMMENTOR::Bob Kline
BZCOMMENT::4

(In reply to comment #3)

> Do we need to prevent documents that are being marked with the attribute of
> Usage=Internal from being published at all?

Yes.

Comment entered 2011-05-20 13:24:51 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2011-05-20 13:24:51
BZCOMMENTOR::Volker Englisch
BZCOMMENT::5

Bob, the code for rejecting documents that shouldn't be published (i.e. documents without a publishable version) is in the server code and I believe that's where this test regarding falsely hot-fixing internal audio documents belongs as well.

What's your feeling? Should we include this Media test in the server code or the publishing code?

Comment entered 2011-05-20 17:35:23 by Kline, Bob (NIH/NCI) [C]

BZDATETIME::2011-05-20 17:35:23
BZCOMMENTOR::Bob Kline
BZCOMMENT::6

(In reply to comment #5)
> Bob, the code for rejecting documents that shouldn't be published (i.e.
> documents without a publishable version) is in the server code and I believe
> that's where this test regarding falsely hot-fixing internal audio documents
> belongs as well.
>
> What's your feeling? Should we include this Media test in the server code or
> the publishing code?

That's a tough question. We've resisted embedding content-aware logic inside the server, though we've given in for a couple of things. Let's talk about it when Alan returns. I don't think it's so urgent that it can't wait.

Comment entered 2011-06-10 16:04:00 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2011-06-10 16:04:00
BZCOMMENTOR::Volker Englisch
BZCOMMENT::7

As discussed in our status meeting we will not modify the server to prevent these hot-fixed audio files from being published.

Comment entered 2011-06-14 14:23:10 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2011-06-14 14:23:10
BZCOMMENTOR::Volker Englisch
BZCOMMENT::8

Since there are many parts that need to be completed to include the audio files I am using this issue to create a (hopefully) complete list of those bugs that are part of the project.

[ ] OCECDR-3237 (BK): [Glossary Audio] Adding Audio Pronunciations to Glossary
Documents
[ ] OCECDR-3327 (AM): [Glossary Audio] Web Interface for reviewing audio
pronunciations
[X] OCECDR-3356 (VE): [Media] Update Publishing Document for Audio Files
[ ] OCECDR-3357 (VE): [Media] Submit Change Notification to PDQ Vendors
[X] OCECDR-3358 (VE): [Media] Modify Publishing Software to Process Audio Files
(Ready except for media hot-fix modifications)
[X] OCECDR-3362 (VE): [Media] Change to DocTitle Filters
[ ] OCECDR-3363 (VE): [Media] Template for Audio Pronunciation Documents
[ ] OCECDR-3364 (BK): [Media] Adding ProcessingStatusValues for use with Audio
Pronunciations?
[X] OCECDR-3370 (VE): [Media] Modify Vendor Filters to Process Audio Files
[ ] OCECDR-3371 (VE): [Media] Modify Vendor DTD to Process Audio Files
[ ] OCECDR-3373 (BK): Automate process of creating audio pronunciation media
documents

Comment entered 2011-07-01 18:22:56 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2011-07-01 18:22:56
BZCOMMENTOR::Volker Englisch
BZCOMMENT::9

For my information:
These are the things that need to be done for the audio/RelatedInfo data to go live.

Migrate Filters:
----------------
CDR0000271370.xml - Module: Vendor Filter Templates
CDR0000505580.xml - Module: Vendor Filter: DrugInfoSummary
CDR0000415359.xml - DocTitle for Media
CDR0000616047.xml - Denormalization Filter: GlossaryTermName
CDR0000616048.xml - Vendor Filter: GlossaryTermName
CDR0000486313.xml - Denormalization Filter: DrugInfoSummary
CDR0000617324.xml - Denormalization Filter: GlossaryTermName - MediaLink

Python scripts
--------------
cdr.py
cdrpub.py

Publishing document
-------------------
M: \home\venglisch\temp\178_m.xml

DTD
---
pdq.dtd
pdqCG.dtd

Comment entered 2011-07-14 11:44:16 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2011-07-14 11:44:16
BZCOMMENTOR::Volker Englisch
BZCOMMENT::10

Updating the Media related tasks of this check list:

[ ] OCECDR-3237 (BK): [Media] Adding Audio Pronunciations to Glossary
Documents
*[X] OCECDR-3327 (AM): [Glossary Audio] Web Interface for reviewing audio
pronunciations
[X] OCECDR-3356 (VE): [Media] Update Publishing Document for Audio Files
[X] OCECDR-3357 (VE): [Media] Submit Change Notification to PDQ Vendors
[X] OCECDR-3358 (VE): [Media] Modify Publishing Software to Process Audio Files
(Ready except for media hot-fix modifications)
[X] OCECDR-3362 (VE): [Media] Change to DocTitle Filters
*[X] OCECDR-3363 (VE): [Media] Template for Audio Pronunciation Documents
[X] OCECDR-3364 (BK): [Media] Adding ProcessingStatusValues for use with Audio
Pronunciations?
[X] OCECDR-3370 (VE): [Media] Modify Vendor Filters to Process Audio Files
[X] OCECDR-3371 (VE): [Media] Modify Vendor DTD to Process Audio Files
[ ] OCECDR-3373 (BK): Automate process of creating audio pronunciation media
documents
[X] OCECDR-3390 (VE): [Media] Audio player JS files not accessible in CDR
Publish-Preview

Also in 6.2:
[X] OCECDR-3376 (VE): Modify DTD and Vendor Filters to Include Related
Information

Comment entered 2011-07-17 23:31:41 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2011-07-17 23:31:41
BZCOMMENTOR::Volker Englisch
BZCOMMENT::11

The following programs have been copied to BACH:
cdr.py - R10119
cdrpub.py - R10122

Comment entered 2011-07-27 18:09:18 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2011-07-27 18:09:18
BZCOMMENTOR::Volker Englisch
BZCOMMENT::12

We still need to update the publishing software to prevent meeting recording audio files from being published when they are submitted as hot-fixes.

Comment entered 2011-08-01 13:23:21 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2011-08-01 13:23:21
BZCOMMENTOR::Volker Englisch
BZCOMMENT::13

The program has been modified to double-check whether a document ID entered to be hot-fixed belongs to a media document with Usage='Internal'. If this is the case, the publishing process will be interrupted so that the operator can remove the document.
Publishing.py - R10150

This is ready for review on MAHLER.

Comment entered 2011-08-01 13:41:27 by Kline, Bob (NIH/NCI) [C]

BZDATETIME::2011-08-01 13:41:27
BZCOMMENTOR::Bob Kline
BZCOMMENT::14

(In reply to comment #13)
> The program has been modified to double-check if a document ID entered to be
> hot-fixed belongs to a media document with Usage='Internal'. If this is the
> case the publishing process will be interrupted for the operator to remove the
> document.
> Publishing.py - R10150
>
> This is ready for review on MAHLER.

Let's do a code walk-through tomorrow.

Comment entered 2011-08-01 14:01:01 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2011-08-01 14:01:01
BZCOMMENTOR::Volker Englisch
BZCOMMENT::15

Sure, although the changes were very small and localized:

I added a function named isMeetingRecording():

#----------------------------------------------------------------------
# Testing if the document to be hot-fixed is a meeting recording doc.
#----------------------------------------------------------------------
def isMeetingRecording(id):
    try:
        conn = cdrdb.connect()
        cursor = conn.cursor()
    except cdrdb.Error, info:
        reason = "Failure: %s" % info[1][0]
        cdr.logwrite("Cdr connection failed in isMeetingRecording(). \
                      %s" % reason)

    cursor.execute("""
        select d.id
          from document d
          join query_term_pub q
            on q.doc_id = d.id
           and q.path = '/Media/@Usage'
         where d.id = ?
           and q.value = 'Internal' -- Meeting Recording
    """, id)
    row = cursor.fetchone()

    if row:
        return True
    return False

and I am calling this function to determine whether the given CDR-ID is allowed to be included in the list of IDs to be hot-fixed. If an invalid document is identified, I'm displaying an error message. This approach has the disadvantage that the operator only receives a message for the first error encountered when multiple documents have a problem. This is, however, the same way that we are handling documents for which a publishable version doesn't exist. In addition, since there are generally only very few documents to be hot-fixed, this approach shouldn't put a big burden on the operator.
Here's the snippet that's calling the function isMeetingRecording():

for i in range(len(docIds)):
    # At the moment the media documents include image files,
    # audio pronunciation files, and meeting recordings. The
    # meeting recordings (MR) are excluded from regular
    # publishing but the hot-fix publishing would allow a
    # document to be published if the ID gets entered manually.
    # We're checking the IDs here to prevent this.
    # ---------------------------------------------------------
    if isMeetingRecording(int(docIds[i])):
        cdr.logwrite("Error: Internal document detected - \
                      %s" % docIds[i])
        cdrcgi.bail("Error: Unable to publish Meeting \
                     Recordings (usage='internal') - \
                     CDR%s" % docIds[i])
    docIds[i] = 'CDR' + str(int(docIds[i]))
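
As an aside on the "first error only" limitation mentioned above, the following is a rough sketch (not the code that was checked in) of how all offending IDs could be collected in a single query and reported together. It is written against Python's built-in sqlite3 module purely so that it runs on its own. The in-memory table mirrors only the query_term_pub columns used in the query above, the sample rows and the 'External' value are made up, and find_internal_media() and make_demo_db() are hypothetical names. The join to the document table is left out here for brevity.

```python
import sqlite3


def find_internal_media(conn, doc_ids):
    """Return every ID in doc_ids that is a Media doc with Usage='Internal'."""
    if not doc_ids:
        return []
    # One placeholder per ID keeps the query parameterized.
    placeholders = ",".join("?" for _ in doc_ids)
    query = f"""
        SELECT DISTINCT q.doc_id
          FROM query_term_pub q
         WHERE q.path = '/Media/@Usage'
           AND q.value = 'Internal'
           AND q.doc_id IN ({placeholders})
         ORDER BY q.doc_id
    """
    rows = conn.execute(query, [int(i) for i in doc_ids]).fetchall()
    return [row[0] for row in rows]


def make_demo_db():
    """Build a throwaway in-memory table with made-up rows for the demo."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE query_term_pub (doc_id INTEGER, path TEXT, value TEXT)"
    )
    conn.executemany(
        "INSERT INTO query_term_pub VALUES (?, ?, ?)",
        [
            (1001, "/Media/@Usage", "Internal"),
            (1002, "/Media/@Usage", "External"),
            (1003, "/Media/@Usage", "Internal"),
        ],
    )
    return conn


if __name__ == "__main__":
    conn = make_demo_db()
    blocked = find_internal_media(conn, ["1001", "1002", "1003", "1004"])
    for doc_id in blocked:
        print("Error: Unable to publish Meeting Recordings "
              "(usage='internal') - CDR%d" % doc_id)
```

The trade-off is the one already described: reporting one error at a time matches how missing publishable versions are handled, while a batch check would only pay off if operators regularly entered many IDs at once.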

Comment entered 2011-08-09 17:31:23 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2011-08-09 17:31:23
BZCOMMENTOR::Volker Englisch
BZCOMMENT::16

While I was preparing to print out the changes for a code review I noticed a bug in my code which led to some more changes.
I need to do a little more testing before we can have a look at the new changes.

Comment entered 2011-09-01 17:29:54 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2011-09-01 17:29:54
BZCOMMENTOR::Volker Englisch
BZCOMMENT::17

I've included some additional changes after a code review with Bob. The changes have been implemented and tested successfully on MAHLER:
publishing.py - R10188

Comment entered 2011-09-01 17:33:13 by Kline, Bob (NIH/NCI) [C]

BZDATETIME::2011-09-01 17:33:13
BZCOMMENTOR::Bob Kline
BZCOMMENT::18

Go ahead and promote the changes.

Comment entered 2011-09-01 17:49:10 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2011-09-01 17:49:10
BZCOMMENTOR::Volker Englisch
BZCOMMENT::19

The changes have been copied to FRANCK and BACH.

Comment entered 2011-09-01 17:58:55 by Kline, Bob (NIH/NCI) [C]

BZDATETIME::2011-09-01 17:58:55
BZCOMMENTOR::Bob Kline
BZCOMMENT::20

All done.
