Issue Number | 3370 |
---|---|
Summary | [Media] Modify Vendor Filters to Process Audio Files |
Created | 2011-05-23 11:05:13 |
Issue Type | Improvement |
Submitted By | Englisch, Volker (NIH/NCI) [C] |
Assigned To | Englisch, Volker (NIH/NCI) [C] |
Status | Closed |
Resolved | 2011-07-27 18:09:53 |
Resolution | Fixed |
Path | /home/bkline/backups/jira/ocecdr/issue.107698 |
BZISSUE::5063
BZDATETIME::2011-05-23 11:05:13
BZCREATOR::Volker Englisch
BZASSIGNEE::Volker Englisch
BZQACONTACT::William Osei-Poku
We need to modify the vendor filters for GlossaryTerm and DIS (DrugInformationSummary) to include the MediaLink element.
In addition, because we decided to add a (MIME) type attribute to the MediaLink element to distinguish image files from audio files, every vendor filter that already contains the MediaLink element will need to be updated.
BZDATETIME::2011-05-25 15:29:43
BZCOMMENTOR::Volker Englisch
BZCOMMENT::1
The following filters have been updated on MAHLER in order to produce
the MediaLink for audio files:
CDR616048 - Vendor Filter: GlossaryTermName
CDR271370 - Module: Vendor Filter Templates
The DTD has been modified (OCECDR-3371) to include a "type"
attribute.
a) Sample for Audio
<MediaLink ref="CDR0000696822" type="audio/mpeg"
alt="Alt-text" language="es" id="_4"/>
b) Sample for Image
<MediaLink ref="CDR0000428405" type="image/jpeg"
alt="Alt-text" language="en" thumb="Yes" id="_3">
<Caption language="en">...</Caption>
</MediaLink>
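A minimal sketch of how a consumer of the vendor output could use the new type attribute to tell audio from image links. It uses the two samples above; the function name media_kind is hypothetical, and this is not the GateKeeper implementation.

```python
import xml.etree.ElementTree as ET


def media_kind(media_link_xml: str) -> str:
    """Return 'audio', 'image', or 'other' based on the MediaLink type attribute."""
    element = ET.fromstring(media_link_xml)
    mime_type = element.get("type", "")
    # The top-level MIME type is the part before the slash (e.g. "audio/mpeg").
    top_level = mime_type.split("/", 1)[0]
    return top_level if top_level in ("audio", "image") else "other"


# The two samples from this comment, reproduced as strings.
AUDIO_SAMPLE = (
    '<MediaLink ref="CDR0000696822" type="audio/mpeg" '
    'alt="Alt-text" language="es" id="_4"/>'
)
IMAGE_SAMPLE = (
    '<MediaLink ref="CDR0000428405" type="image/jpeg" '
    'alt="Alt-text" language="en" thumb="Yes" id="_3">'
    '<Caption language="en">...</Caption>'
    '</MediaLink>'
)

print(media_kind(AUDIO_SAMPLE))  # audio
print(media_kind(IMAGE_SAMPLE))  # image
```

A MediaLink without a type attribute falls into the "other" bucket, which is the ambiguity the DTD change is meant to remove.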
BZDATETIME::2011-05-31 17:16:35
BZCOMMENTOR::Volker Englisch
BZCOMMENT::2
Email from Blair regarding the implemented changes
----------------------------------------------------------------------
From: Learn, Blair (NIH/NCI) [C]
Sent: Wednesday, May 25, 2011 1:25 PM
To: Englisch, Volker (NIH/NCI) [C]; Prasad, Betnag (NIH/NCI) [C]; Kline, Robert (NCI)
Cc: Luke, Emile (NIH/NCI) [C]
Subject: RE: PDQ DTD R10093
I spoke with Volker earlier about how Media documents are referenced from DrugInformationSummary documents. What I understood from our conversation is that the plan is for GateKeeper to retrieve the MediaLink used by the GlossaryTerm document referenced from within the DrugInfoMetaData element.
This is a problem in that GateKeeper only loads one document at a time and therefore doesn't have access to the contents of other documents. (It's similar to the problem we had with the previous plan for MediaLink to not include the type attribute.) GateKeeper's use of references to other documents (e.g. the TerminologyLink) is presently limited to creating links to other pages and using the CDR id as an argument.
Volker suggested that it might be possible to resolve this on the CDR side by adding a filter in the post-processing to add a MediaLink to the DrugInfoMetaData. I believe that would solve it for GateKeeper.
BZDATETIME::2011-05-31 17:23:00
BZCOMMENTOR::Volker Englisch
BZCOMMENT::3
Because the proposed DTD would have been difficult for GateKeeper to process, we've decided to denormalize the MediaLink information for the vendors and create a new element named PronunciationInfo within the metadata block. PronunciationInfo contains the TermPronunciation and the audio MediaLinks.
These changes have been implemented and tested on MAHLER.
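A rough sketch of the denormalization idea, for illustration only. The real work is done by the CDR filters; the function name add_pronunciation_info, the exact nesting under DrugInfoMetaData, and the placeholder pronunciation text are all assumptions. Only the element names come from this issue.

```python
import xml.etree.ElementTree as ET


def add_pronunciation_info(meta_data, pronunciation, media_links):
    """Append a PronunciationInfo block to the metadata element.

    Assumed layout (not confirmed by the DTD): PronunciationInfo holds one
    TermPronunciation plus every MediaLink whose type starts with "audio/".
    """
    info = ET.SubElement(meta_data, "PronunciationInfo")
    if pronunciation:
        ET.SubElement(info, "TermPronunciation").text = pronunciation
    for link in media_links:
        # Copy only audio links; image links stay where they were.
        if link.get("type", "").startswith("audio/"):
            info.append(link)
    return info


meta = ET.Element("DrugInfoMetaData")
links = [
    ET.fromstring('<MediaLink ref="CDR0000696822" type="audio/mpeg" language="es" id="_4"/>'),
    ET.fromstring('<MediaLink ref="CDR0000428405" type="image/jpeg" language="en" id="_3"/>'),
]
add_pronunciation_info(meta, "placeholder pronunciation", links)
print(ET.tostring(meta, encoding="unicode"))
```

With the audio link copied into the summary's own metadata, a consumer that loads one document at a time no longer has to open the referenced GlossaryTerm document.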
BZDATETIME::2011-06-06 13:03:33
BZCOMMENTOR::Margaret Beckwith
BZCOMMENT::4
Volker, is there anything we need to do to test this before it can be promoted or are you doing all of the testing?
BZDATETIME::2011-06-06 13:12:41
BZCOMMENTOR::Volker Englisch
BZCOMMENT::5
We won't be able to move these filters into production until Cancer.gov
is able to process the output.
I've just refreshed FRANCK to run a publishing job with the old
(current) filters to determine whether the audio files could be loaded into
the CDR without affecting publishing. That way we could start loading
audio files now rather than having to wait until Cancer.gov is ready for
the new data.
Right now I am doing all of the testing, but I'm guessing you will be able to preview the changes on the preview site once Cancer.gov gets closer to rolling them out.
BZDATETIME::2011-06-07 18:34:43
BZCOMMENTOR::Volker Englisch
BZCOMMENT::6
We have refreshed the CDR database on FRANCK (using a backup from
BACH).
I ran a publishing job for
DrugInfoSummary
GlossaryTerm
Media
Summary
Terminology
After the publishing jobs finished, Bob loaded all available audio
files to the CDR on FRANCK and I ran the same publishing jobs in order
to run a before/after diff.
All of the diffs between the publishing jobs of the individual doc types
showed no changes, with one exception: 8 glossary documents were
missing from the later run.
Bob: Would this be related to the fact that you had reverted an earlier run?
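For reference, a before/after comparison of two publishing job output directories can be sketched like this. It is not the procedure actually used for the runs above; the function name compare_job_output and the file names in the demo are hypothetical.

```python
import filecmp
import tempfile
from pathlib import Path


def compare_job_output(before_dir: str, after_dir: str) -> dict:
    """Compare the top-level files of two job output directories."""
    comparison = filecmp.dircmp(before_dir, after_dir)
    return {
        "only_before": sorted(comparison.left_only),
        "only_after": sorted(comparison.right_only),
        # dircmp compares shallowly: files with identical os.stat signatures
        # are assumed equal, otherwise their contents are compared.
        "changed": sorted(comparison.diff_files),
    }


# Demo with throwaway directories and made-up file names.
with tempfile.TemporaryDirectory() as root:
    before = Path(root, "before")
    after = Path(root, "after")
    before.mkdir()
    after.mkdir()
    (before / "CDR1.xml").write_text("<Doc/>")
    (after / "CDR1.xml").write_text("<Doc/>")
    (before / "CDR2.xml").write_text("<Doc/>")  # missing from the later run
    result = compare_job_output(str(before), str(after))

print(result)
```

Files listed under "only_before" correspond to documents that were published in the earlier run but not in the later one.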
BZDATETIME::2011-06-07 18:35:47
BZCOMMENTOR::Volker Englisch
BZCOMMENT::7
I think I should add Bob to this issue so he can answer my question in the last comment.
BZDATETIME::2011-06-07 22:45:37
BZCOMMENTOR::Bob Kline
BZCOMMENT::8
(In reply to comment #6)
> Bob: Would this be related to the fact that you had reverted an earlier run?
Don't think so. Which documents were missing?
BZDATETIME::2011-06-08 10:01:12
BZCOMMENTOR::Volker Englisch
BZCOMMENT::9
(In reply to comment #8)
> Don't think so. Which documents were missing?
Only in Job8760: CDR44220.xml
Only in Job8760: CDR44229.xml
Only in Job8760: CDR44244.xml
Only in Job8760: CDR44286.xml
Only in Job8760: CDR44301.xml
Only in Job8760: CDR44338.xml
Only in Job8760: CDR44428.xml
Only in Job8760: CDR44441.xml
You're probably right. I see that for each of these documents there is a
newer publishable version, which should have been picked up for
publishing but wasn't.
I'll have a look at the selection criteria.
BZDATETIME::2011-06-08 13:34:17
BZCOMMENTOR::Volker Englisch
BZCOMMENT::10
(In reply to comment #8)
> Don't think so. Which documents were missing?
Only in Job8760: CDR44220.xml
Only in Job8760: CDR44229.xml
Only in Job8760: CDR44244.xml
Only in Job8760: CDR44286.xml
Only in Job8760: CDR44301.xml
Only in Job8760: CDR44338.xml
Only in Job8760: CDR44428.xml
Only in Job8760: CDR44441.xml
You're probably right. I see that for each of these documents there is a
newer publishable version, which should have been picked up for
publishing but wasn't.
I'll have a look at the selection criteria.
I reran the publishing job for these, and it turns out that they link to nonexistent Media documents. That's why they weren't created.
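A small sketch of the kind of check that would flag such documents ahead of a publishing run. The function name missing_media_refs, the GlossaryTerm wrapper element, and the ID CDR0000000001 are made up for illustration; the CDR's own link validation works differently.

```python
import xml.etree.ElementTree as ET


def missing_media_refs(document_xml: str, known_media_ids: set) -> list:
    """Return MediaLink ref values that are not in the set of known Media IDs."""
    root = ET.fromstring(document_xml)
    # Collect the ref attribute of every MediaLink, skipping links without one.
    refs = [link.get("ref") for link in root.iter("MediaLink") if link.get("ref")]
    return [ref for ref in refs if ref not in known_media_ids]


# Hypothetical document: the second ref points at a Media document that does not exist.
SAMPLE_DOC = (
    '<GlossaryTerm>'
    '<MediaLink ref="CDR0000696822" type="audio/mpeg"/>'
    '<MediaLink ref="CDR0000000001" type="audio/mpeg"/>'
    '</GlossaryTerm>'
)
KNOWN_MEDIA = {"CDR0000696822"}

print(missing_media_refs(SAMPLE_DOC, KNOWN_MEDIA))
```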
BZDATETIME::2011-06-08 13:35:58
BZCOMMENTOR::Volker Englisch
BZCOMMENT::11
Seems the new Bugzilla has a new way to deal with mid-air collisions and I clicked the button I thought would throw away the second-to-last comment but guess what? :-)
BZDATETIME::2011-07-17 23:35:09
BZCOMMENTOR::Volker Englisch
BZCOMMENT::12
The following filters have been copied to BACH:
CDR0000271370.xml - R10121: Module: Vendor Filter Templates
CDR0000505580.xml - R10121: Module: Vendor Filter: DrugInfoSummary
CDR0000415359.xml - R10118: DocTitle for Media
CDR0000616047.xml - R10121: Denormalization Filter: GlossaryTermName
CDR0000616048.xml - R10121: Vendor Filter: GlossaryTermName
CDR0000486313.xml - R10121: Denormalization Filter: DrugInfoSummary
CDR0000617324.xml - R10121: Denormalization Filter: GlossaryTermName - MediaLink
BZDATETIME::2011-07-20 18:41:15
BZCOMMENTOR::Volker Englisch
BZCOMMENT::13
All of the changes have been in production since Sunday night.
Once we've gone through Friday's regular weekly publishing job without problems, we should be able to close these issues.
BZDATETIME::2011-07-27 18:09:53
BZCOMMENTOR::Volker Englisch
BZCOMMENT::14
We encountered no problems during last week's publishing jobs.
Closing issue.