Issue Number | 3226 |
---|---|
Summary | [Summary] document version with select markup for translators? |
Created | 2010-09-16 11:41:48 |
Issue Type | Improvement |
Submitted By | Osei-Poku, William (NIH/NCI) [C] |
Assigned To | Englisch, Volker (NIH/NCI) [C] |
Status | Closed |
Resolved | 2010-10-04 16:24:48 |
Resolution | Won't Fix |
Path | /home/bkline/backups/jira/ocecdr/issue.107554 |
BZISSUE::4914
BZDATETIME::2010-09-16 11:41:48
BZCREATOR::William Osei-Poku
BZASSIGNEE::Volker Englisch
BZQACONTACT::William Osei-Poku
Discussion points:
1. When the English Summary has been revised and is ready to be
published, a final markup version is created and saved for Spanish
translation. This version could contain markup at different revision
levels (approved, proposed and publish). Spanish translators prefer to see only
the markup they will be translating, which is usually one or two of
them.
We would like to discuss ways of allowing a user to see the markup for
only one revision level.
2. At other times, more than one 'final markup' version is created, and this
creates difficulties for the translators because they have to consult
more than one version of the document to complete the translation.
We would also like to discuss ways of making it easier for
the user to review more than one final markup version.
BZDATETIME::2010-09-17 09:39:22
BZCOMMENTOR::Volker Englisch
BZCOMMENT::1
(In reply to comment #0)
> Spanish translators prefer to see only the markup they will be translating,
> which is usually one or two of them.
I am still trying to understand how the result of this request is
different from properly using the redline/strikeout QC reports?
Currently, we are using the three levels of markup within our QC
reports:
publish, approved, and proposed
"Publish" markup is displayed with a brown font and some brownish
background.
"Approved" markup is displayed in red.
"Proposed" markup is displayed in green.
When the RS QC report is run, the user has the option to select which
markup level is marked up and which level of markup is displayed.
Two rules apply:
a) A markup level is marked up in the text if it has been checked.
b) Markup levels "higher" in the hierarchy are applied, and markup
levels "lower" in the hierarchy are ignored.
As an example, if "publish" has been selected for markup, this is the level
that is marked up in the text and the lower levels (approved and proposed)
are ignored.
However, if "proposed" has been selected for markup, the proposed text will
be marked up and the higher levels (publish and approved) will be applied.
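The two rules above can be sketched in Python as follows. This is only an illustration, not the code of the QC report filters: the function name `disposition` and the labels "mark", "apply" and "ignore" are made up for this sketch, and it assumes that a single level has been selected.

```python
# Hierarchy as described above, ordered from lowest to highest.
LEVELS = ["proposed", "approved", "publish"]


def disposition(level, selected):
    """Return what happens to markup at `level` when `selected` is checked.

    "mark"   -> shown as redline/strikeout in the report
    "apply"  -> the change is applied silently (higher level)
    "ignore" -> the change is ignored (lower level)
    """
    if level == selected:
        return "mark"
    if LEVELS.index(level) > LEVELS.index(selected):
        return "apply"
    return "ignore"


if __name__ == "__main__":
    for selected in LEVELS:
        row = {level: disposition(level, selected) for level in LEVELS}
        print(selected, row)
```

With "proposed" selected, the sketch marks proposed changes and applies the publish and approved changes, matching the second example above.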
I am wondering if this request by the translation team would be satisfied if the QC report, instead of applying higher level markup, dropped it from the output?
BZDATETIME::2010-09-17 09:44:37
BZCOMMENTOR::Volker Englisch
BZCOMMENT::2
I'm attaching the revision markup table that lists how the filters are handling the markup in the QC reports. I'm not sure if William has seen this.
Please note that we have changed the color purple to green a while back because purple and red were not easily distinguishable on a printout.
Attachment RevisionLevel_MarkUp.xls has been added with description: Revision Markup Table
BZDATETIME::2010-09-17 17:25:07
BZCOMMENTOR::Bob Kline
BZCOMMENT::3
Here's another approach that may make the life of the translators easier:
http://bach.nci.nih.gov/QcReportRSWithJS.html
I've added JavaScript to this QC with redline-strikeout page which allows the user to press the j (as in "jump") key to jump to the next Insertion or Deletion segment. In all but the buggiest browsers (are any of our users still using IE6?) the segment will be highlighted with a yellow background. Of course, JavaScript has to be enabled for this to work.
The technique could be even more effective if it weren't for the fact that the editors of the English version of the summary sometimes use redundant Insertion and Deletion elements. Occasionally there's an Insertion or Deletion element with absolutely nothing in it. I've made the code skip over those (though they're included in the count that appears when you hover over an insertion or deletion segment with your mouse), so that's not as much of a problem. More frequently, however, I see redundant nested markup. For example, at the very top of the document:
<Deletion ...><Deletion ...>(11/18/2009)</Deletion></Deletion>
<Insertion ...><Insertion ...>(08/29/2010)</Insertion></Insertion>
When that happens (and it happens a lot) you'll end up with the same segment highlighted twice in a row.
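A rough Python sketch of how the jump targets might be collected. The page itself uses JavaScript, which is not reproduced here; the function name `jump_targets` and the sample markup are hypothetical. The sketch skips empty elements and shows why redundant nested markup yields the same segment twice.

```python
import xml.etree.ElementTree as ET

MARKUP = ("Insertion", "Deletion")


def jump_targets(xml_text):
    """List (tag, text) for every non-empty Insertion/Deletion, in document order."""
    root = ET.fromstring(xml_text)
    targets = []
    for elem in root.iter():
        if elem.tag in MARKUP:
            text = "".join(elem.itertext()).strip()
            if text:  # skip elements with absolutely nothing in them
                targets.append((elem.tag, text))
    return targets


if __name__ == "__main__":
    sample = (
        "<Para>"
        "<Deletion><Deletion>(11/18/2009)</Deletion></Deletion>"
        "<Insertion/>"
        "<Insertion><Insertion>(08/29/2010)</Insertion></Insertion>"
        "</Para>"
    )
    for target in jump_targets(sample):
        print(target)
```

Each nested pair contributes two targets with identical text, one for the outer element and one for the inner element.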
Also, I can imagine that if I were one of the translators, I'd be happier if the editors of the English version applied the markup with a little less granularity. So, instead of
<Insertion>The</Insertion><Deletion>A</Deletion> <Insertion>quick</Insertion><Deletion>peppy</Deletion> <Insertion>brown</Insertion><Deletion>russet</Deletion> <Insertion>fox</Insertion><Deletion>marmoset</Deletion> <Insertion>jumped</Insertion><Deletion>leaped</Deletion> <Insertion>over</Insertion><Deletion>above</Deletion> <Insertion>the</Insertion><Deletion>some</Deletion> <Insertion>lazy</Insertion><Deletion>indolent</Deletion> <Insertion>dog</Insertion><Deletion>cat</Deletion><Insertion>.</Insertion><Deletion>!</Deletion>
... I might hope for
<Deletion>A peppy ... cat!</Deletion><Insertion>The quick ... dog.</Insertion>
I'm exaggerating, but I think you'll appreciate my point if you walk through the entire sample document trying to put yourself in the shoes of the translator.
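To make the difference in granularity concrete, here is a small Python sketch that merges a run of word-level changes into one Deletion and one Insertion. It is purely illustrative: the `(kind, text)` tuples and the function name `coalesce` are assumptions made for this example, and nothing like it exists in the QC filters as far as this thread shows.

```python
def coalesce(segments):
    """Merge word-level changes into a single Deletion and a single Insertion.

    `segments` is a list of (kind, text) tuples where kind is "Insertion",
    "Deletion" or "Text" (unchanged text, such as the spaces between words).
    Unchanged text belongs to both the old and the new wording.
    """
    deleted = "".join(t for kind, t in segments if kind in ("Deletion", "Text"))
    inserted = "".join(t for kind, t in segments if kind in ("Insertion", "Text"))
    return [("Deletion", deleted), ("Insertion", inserted)]


if __name__ == "__main__":
    fine_grained = [
        ("Insertion", "The"), ("Deletion", "A"), ("Text", " "),
        ("Insertion", "quick"), ("Deletion", "peppy"), ("Text", " "),
        ("Insertion", "dog"), ("Deletion", "cat"),
        ("Insertion", "."), ("Deletion", "!"),
    ]
    print(coalesce(fine_grained))
```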
BZDATETIME::2010-09-21 11:18:59
BZCOMMENTOR::Volker Englisch
BZCOMMENT::4
(In reply to comment #3)
> The technique could be even more effective if it weren't for the fact that the
> editors of the English version of the summary sometimes use redundant
> Insertion and Deletion elements.
Looking at the filters it appears that we can't fault the editors for
the redundant Insertion/Deletion elements. In fact, the filters that are
part of the QC Insertion/Deletion Set are creating these additional
elements.
I believe the reason for this is that we want to move Insertion/Deletion
tags which are around an entire paragraph, for instance, so that they
only appear around text nodes like this:
Original:
---------
<Deletion>
<Para>This is a paragraph with a <GlossaryTerm>Stage I</GlossaryTerm>
glossary term.</Para>
</Deletion>
Converted to:
-------------
<Deletion>
<Para><Deletion>This is a paragraph with a </Deletion>
<GlossaryTerm><Deletion>Stage I</Deletion></GlossaryTerm>
<Deletion>glossary term.</Deletion></Para>
</Deletion>
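The conversion shown above can be approximated in Python with the standard library. This is a sketch of the idea only, not the XSLT filter itself: the function name `wrap_text_nodes` is hypothetical, attributes on the markup elements are ignored, and the sample is written on one line to avoid whitespace-only text nodes.

```python
import xml.etree.ElementTree as ET


def wrap_text_nodes(elem, tag):
    """Wrap every non-blank text node at or below `elem` in a new <tag> element."""
    rebuilt = []
    if elem.text and elem.text.strip():
        wrapper = ET.Element(tag)
        wrapper.text = elem.text
        rebuilt.append(wrapper)
    elem.text = None
    for child in list(elem):
        wrap_text_nodes(child, tag)
        tail, child.tail = child.tail, None
        rebuilt.append(child)
        if tail and tail.strip():
            # Text following a child element is stored in that child's tail.
            wrapper = ET.Element(tag)
            wrapper.text = tail
            rebuilt.append(wrapper)
    elem[:] = rebuilt


if __name__ == "__main__":
    outer = ET.fromstring(
        "<Deletion><Para>This is a paragraph with a "
        "<GlossaryTerm>Stage I</GlossaryTerm> glossary term.</Para></Deletion>"
    )
    for block in outer:
        wrap_text_nodes(block, outer.tag)
    print(ET.tostring(outer, encoding="unicode"))
```

The outer Deletion is still present after this step, which is why a second cleanup pass is needed.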
At a later stage, a subsequent filter removes the outside deletions (or
insertions), but this process only appears to remove the markup if the
child is not a deletion itself. It therefore leaves constructs like the
ones you've noticed:
<Deletion>
<Deletion>Deleted Text</Deletion>
</Deletion>
If we're going with your approach, Bob, we can create another filter to clean up these double-markup elements, which have probably been left in place because they haven't had any negative effect until now.
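A minimal Python sketch of what such a cleanup could look like. It is not the CDR filter, which is written as an XSLT filter; the names `is_redundant` and `collapse` are hypothetical. The sketch assumes that an outer element is redundant when it has no text of its own and contains only elements of the same kind, and it ignores attributes, which a real filter would have to compare before dropping the outer element.

```python
import xml.etree.ElementTree as ET

MARKUP = {"Insertion", "Deletion"}


def _blank(text):
    return not (text and text.strip())


def is_redundant(elem):
    """True for <X> whose only content is one or more nested <X> elements."""
    if elem.tag not in MARKUP or len(elem) == 0 or not _blank(elem.text):
        return False
    return all(c.tag == elem.tag and _blank(c.tail) for c in elem)


def collapse(parent):
    """Replace redundant outer markup elements with their children, bottom-up."""
    rebuilt = []
    for child in list(parent):
        collapse(child)
        if is_redundant(child):
            inner = list(child)
            inner[-1].tail = child.tail  # keep the text that followed the outer element
            rebuilt.extend(inner)
        else:
            rebuilt.append(child)
    parent[:] = rebuilt


if __name__ == "__main__":
    root = ET.fromstring(
        "<Para><Deletion>\n<Deletion>Deleted Text</Deletion>\n</Deletion> kept</Para>"
    )
    collapse(root)
    print(ET.tostring(root, encoding="unicode"))
```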
BZDATETIME::2010-09-21 14:37:35
BZCOMMENTOR::William Osei-Poku
BZCOMMENT::5
(In reply to comment #1)
> (In reply to comment #0)
> > Spanish translators prefer to see only the markup they will be translating,
> > which is usually one or two of them.
>
> I am still trying to understand how the result of this request is different
> from properly using the redline/strikeout QC reports?
To the translators, knowing which elements they are working with is very important. While the QC reports are good, when translators tile two XMetal documents, for example, and go from section to section, and from tag to tag, it makes it easier to identify exactly what needs to be translated. This is why our first request was for an enhancement in XMetal as opposed to the QC reports. However, since that is impossible, a QC report that also shows elements and possibly identifies elements with attributes will be helpful.
BZDATETIME::2010-09-22 17:03:29
BZCOMMENTOR::Volker Englisch
BZCOMMENT::6
(In reply to comment #5)
> While the QC reports are good, when translators tile two XMetal
> documents, for example, and go from section to section, and from tag to tag,
> it makes it easier to identify exactly what needs to be translated.
Another idea along the lines of showing the tags as part of a QC report would be to apply CSS to the XML output. This needs to be explored further, but it would be possible to apply CSS to individual elements, as we do within XMetaL, but to do it outside of XMetaL.
I've marked up a document as a test on MAHLER:
file:///M:/home/venglisch/CDR/Filters/62969.xml
(sorry William, I don't think you'll be able to preview this at CIAT)
We would be able to suppress information that the translators do not need to look at, such as BoardMember or Date elements.
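As a rough illustration of the approach, the Python sketch below adds an `xml-stylesheet` processing instruction to an XML document and generates a small stylesheet that hides selected elements. The contents of the real xmlstyle.css are not shown in this thread, so the rules here are invented: the `RevisionLevel` attribute name is an assumption, the colors are borrowed from the QC report description earlier in this issue, and the function names are hypothetical.

```python
def add_stylesheet_pi(xml_text, href="xmlstyle.css"):
    """Insert an xml-stylesheet processing instruction after the XML declaration."""
    pi = f'<?xml-stylesheet type="text/css" href="{href}"?>\n'
    if xml_text.startswith("<?xml"):
        end = xml_text.index("?>") + 2
        return xml_text[:end] + "\n" + pi + xml_text[end:].lstrip("\n")
    return pi + xml_text


def build_css(hidden=("BoardMember", "Date")):
    """Build illustrative CSS rules for displaying the XML directly in a browser."""
    rules = [
        "Para { display: block; margin: 0.5em 0; }",
        'Insertion[RevisionLevel="approved"] { color: red; }',    # assumed attribute name
        'Insertion[RevisionLevel="proposed"] { color: green; }',  # assumed attribute name
        "Deletion { text-decoration: line-through; }",
        ", ".join(hidden) + " { display: none; }",  # suppress what translators don't need
    ]
    return "\n".join(rules) + "\n"


if __name__ == "__main__":
    print(add_stylesheet_pi('<?xml version="1.0"?>\n<Summary/>'))
    print(build_css())
```

The stylesheet has to be saved next to the XML file for the relative `href` to resolve when the file is opened in a browser.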
BZDATETIME::2010-09-22 17:25:22
BZCOMMENTOR::Volker Englisch
BZCOMMENT::7
I forgot to mention, the document is best viewed with FireFox or Chrome.
BZDATETIME::2010-09-24 11:12:43
BZCOMMENTOR::Volker Englisch
BZCOMMENT::8
(In reply to comment #4)
> In fact, the filters that are part of
> the QC Insertion/Deletion Set are creating these additional elements.
> I believe the reason for this is that we want to move Insertion/Deletion tags
> which are around an entire paragraph, for instance, so that they only appear
> around text nodes
Because I need to make some changes for OCECDR-3231 (Fix Summary QC
Reports) I have to touch the filter set which is wrapping the
Insertion/Deletion markup around text nodes and I have fixed this
problem along the way.
There was a bug in the filter: it removed the "outside" markup elements
only if other child elements existed that were not Insertion or Deletion
elements. As mentioned earlier, this logic failed when there were only
Insertion or Deletion elements within Insertion/Deletion elements.
The change to the filter
CDR315892 - Clean up Insertion and Deletion
will be implemented along with the other changes for OCECDR-3231.
BZDATETIME::2010-09-24 11:20:40
BZCOMMENTOR::Volker Englisch
BZCOMMENT::9
William, copy this file to your local drive, then open it with FireFox or Chrome.
Attachment 62969.xml has been added with description: XML File with CSS
BZDATETIME::2010-09-24 11:25:57
BZCOMMENTOR::Volker Englisch
BZCOMMENT::10
William, copy this file to the same directory as the XML file you just copied.
Attachment xmlstyle.css has been added with description: XML CSS File
BZDATETIME::2010-09-28 11:37:17
BZCOMMENTOR::William Osei-Poku
BZCOMMENT::11
I downloaded the files and installed them on translators’ computers. I also installed Firefox on their computers but after carefully testing this new implementation and comparing with translating from XMetal, the translators concluded that XMetal offers them a better option for translating documents than the report. I have also explained to the translators that modification to XMetal to accommodate their needs isn’t an option.
BZDATETIME::2010-09-29 14:13:53
BZCOMMENTOR::Volker Englisch
BZCOMMENT::12
(In reply to comment #11)
> ... the translators concluded that XMetal offers them a better option
> for translating documents than the report.
I'm not sure where this leaves us with this task. None of the three options provided here is working for the translators, and I'm assuming that creating a new HTML QC report won't help either.
What are we going to do now?
BZDATETIME::2010-09-29 14:27:03
BZCOMMENTOR::William Osei-Poku
BZCOMMENT::13
(In reply to comment #12)
> What are we going to do now?
I think it is OK to end the search here. I've explained all the limitations to users.
BZDATETIME::2010-10-04 16:22:57
BZCOMMENTOR::Volker Englisch
BZCOMMENT::14
As discussed at last week's status meeting, this issue will not be addressed further.
BZDATETIME::2010-10-04 16:24:48
BZCOMMENTOR::William Osei-Poku
BZCOMMENT::15
Issue closed. Thank you!
File Name | Posted | User |
---|---|---|
62969.xml | 2010-09-24 11:20:40 | Englisch, Volker (NIH/NCI) [C] |
RevisionLevel_MarkUp.xls | 2010-09-17 09:44:37 | Englisch, Volker (NIH/NCI) [C] |
xmlstyle.css | 2010-09-24 11:25:57 | Englisch, Volker (NIH/NCI) [C] |