CDR Tickets

Issue Number 3226
Summary [Summary] document version with select markup for translators?
Created 2010-09-16 11:41:48
Issue Type Improvement
Submitted By Osei-Poku, William (NIH/NCI) [C]
Assigned To Englisch, Volker (NIH/NCI) [C]
Status Closed
Resolved 2010-10-04 16:24:48
Resolution Won't Fix
Path /home/bkline/backups/jira/ocecdr/issue.107554
Description

BZISSUE::4914
BZDATETIME::2010-09-16 11:41:48
BZCREATOR::William Osei-Poku
BZASSIGNEE::Volker Englisch
BZQACONTACT::William Osei-Poku

Discussion points:

1. When the English Summary has been revised and is ready to be published, a final markup version is created and saved for Spanish translation. This version could contain markup at different revision levels (approved, proposed and publish). Spanish translators prefer to see only the markup they will be translating, which is usually one or two of them.
We would like to discuss ways of allowing a user to see the markup for only one revision level.

2. At other times, more than one 'final markup' version is created, which makes things difficult for the translators because they have to consult more than one version of the document to complete the translation.
We would also like to discuss ways of making it easier for the user to review more than one final markup version.

Comment entered 2010-09-17 09:39:22 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2010-09-17 09:39:22
BZCOMMENTOR::Volker Englisch
BZCOMMENT::1

(In reply to comment #0)
> Spanish translators prefer to see only the markup they will be translating,
> which is usually one or two of them.

I am still trying to understand how the result of this request is different from properly using the redline/strikeout QC reports?
Currently, we are using the three levels of markup within our QC reports:
publish, approved, and proposed
"Publish" markup is displayed with a brown font and some brownish background.
"Approved" markup is displayed in red.
"Proposed" markup is displayed in green.

When the RS QC report is run, the user can select which markup level is marked up and which levels of markup are displayed. Two rules apply:
a) A markup level is marked up in the text if it has been checked.
b) Markup levels "higher" in the hierarchy are applied, and markup
levels "lower" in the hierarchy are ignored.
As an example, if "publish" has been selected for markup this is the level
that is marked up in the text and the lower levels (approved and proposed)
are ignored.
However, if "proposed" has been selected for markup the proposed text will
be marked up and the higher levels (publish and approved) will be applied.
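
The two rules above can be sketched as a small function. This is an illustration only, assuming a single selected level and the hierarchy publish > approved > proposed; the function and constant names are hypothetical and are not taken from the CDR filters.

```python
# Hypothetical sketch of the two QC report rules; not the CDR filter code.
# Assumed hierarchy, highest first: publish > approved > proposed.
LEVELS = ["publish", "approved", "proposed"]


def classify_levels(selected):
    """Return how each revision level is treated when one level is selected."""
    if selected not in LEVELS:
        raise ValueError(f"unknown revision level: {selected!r}")
    chosen = LEVELS.index(selected)
    treatment = {}
    for position, level in enumerate(LEVELS):
        if position == chosen:
            treatment[level] = "marked up"   # rule a: the checked level
        elif position < chosen:
            treatment[level] = "applied"     # rule b: higher in the hierarchy
        else:
            treatment[level] = "ignored"     # rule b: lower in the hierarchy
    return treatment


print(classify_levels("publish"))
print(classify_levels("proposed"))
```

How the report behaves when several levels are checked at once is not covered by this sketch.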

I am wondering if this request by the translation team would be satisfied if the QC report, instead of applying higher-level markup, dropped it from the output?

Comment entered 2010-09-17 09:44:37 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2010-09-17 09:44:37
BZCOMMENTOR::Volker Englisch
BZCOMMENT::2

I'm attaching the revision markup table that lists how the filters are handling the markup in the QC reports. I'm not sure if William has seen this.

Please note that we have changed the color purple to green a while back because purple and red were not easily distinguishable on a printout.

Comment entered 2010-09-17 09:44:37 by Englisch, Volker (NIH/NCI) [C]

Attachment RevisionLevel_MarkUp.xls has been added with description: Revision Markup Table

Comment entered 2010-09-17 17:25:07 by Kline, Bob (NIH/NCI) [C]

BZDATETIME::2010-09-17 17:25:07
BZCOMMENTOR::Bob Kline
BZCOMMENT::3

Here's another approach that may make the life of the translators easier:

http://bach.nci.nih.gov/QcReportRSWithJS.html

I've added JavaScript to this QC with redline-strikeout page which allows the user to press the j (as in "jump") key to jump to the next Insertion or Deletion segment. In all but the buggiest browsers (are any of our users still using IE6?) the segment will be highlighted with a yellow background. Of course, JavaScript has to be enabled for this to work.

The technique could be even more effective if it weren't for the fact that the editors of the English version of the summary sometimes use redundant Insertion and Deletion elements. Occasionally there's an Insertion or Deletion element with absolutely nothing in it. I've made the code skip over those (though they're included in the count that appears when you hover over an insertion or deletion segment with your mouse), so that's not as much of a problem. More frequently, however, I see redundant nested markup. For example, at the very top of the document:

<Deletion ...><Deletion ...>(11/18/2009)</Deletion></Deletion>
<Insertion ...><Insertion ...>(08/29/2010)</Insertion></Insertion>

When that happens (and it happens a lot) you'll end up with the same segment highlighted twice in a row.
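
One way to avoid highlighting the same segment twice would be to build the list of jump targets so that empty elements and redundant outer wrappers are left out. The following is a rough sketch in Python rather than the JavaScript used on the page; the function name and the sample fragment are made up for illustration.

```python
import xml.etree.ElementTree as ET

MARKUP = {"Insertion", "Deletion"}


def jump_targets(root):
    """Collect the Insertion/Deletion elements worth jumping to."""
    targets = []
    for elem in root.iter():
        if elem.tag not in MARKUP:
            continue
        # Skip elements that contain no text at all.
        if not "".join(elem.itertext()).strip():
            continue
        # Skip an outer wrapper holding nothing but one element of the same name.
        children = list(elem)
        own_text = (elem.text or "").strip()
        if (len(children) == 1 and children[0].tag == elem.tag
                and not own_text and not (children[0].tail or "").strip()):
            continue
        targets.append(elem)
    return targets


# Hypothetical fragment modelled on the nested markup quoted above.
sample = ET.fromstring(
    "<Summary>"
    "<Deletion><Deletion>(11/18/2009)</Deletion></Deletion>"
    "<Insertion><Insertion>(08/29/2010)</Insertion></Insertion>"
    "<Insertion/>"
    "</Summary>"
)
for target in jump_targets(sample):
    print(target.tag, "".join(target.itertext()))
```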

Also, I can imagine that if I were one of the translators, I'd be happier if the editors of the English version applied the markup with a little less granularity. So, instead of

<Insertion>The</Insertion><Deletion>A</Deletion> <Insertion>quick</Insertion><Deletion>peppy</Deletion> <Insertion>brown</Insertion><Deletion>russet</Deletion> <Insertion>fox</Insertion><Deletion>marmoset</Deletion> <Insertion>jumped</Insertion><Deletion>leaped</Deletion> <Insertion>over</Insertion><Deletion>above</Deletion> <Insertion>the</Insertion><Deletion>some</Deletion> <Insertion>lazy</Insertion><Deletion>indolent</Deletion> <Insertion>dog</Insertion><Deletion>cat</Deletion><Insertion>.</Insertion><Deletion>!</Deletion>

... I might hope for

<Deletion>A peppy ... cat!</Deletion><Insertion>The quick ... dog.</Insertion>

I'm exaggerating, but I think you'll appreciate my point if you walk through the entire sample document trying to put yourself in the shoes of the translator.

Comment entered 2010-09-21 11:18:59 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2010-09-21 11:18:59
BZCOMMENTOR::Volker Englisch
BZCOMMENT::4

(In reply to comment #3)
> The technique could be even more effective if it weren't for the fact that the
> editors of the English version of the summary sometimes use redundant
> Insertion and Deletion elements.

Looking at the filters it appears that we can't fault the editors for the redundant Insertion/Deletion elements. In fact, the filters that are part of the QC Insertion/Deletion Set are creating these additional elements.
I believe the reason for this is that we want to move Insertion/Deletion tags which are around an entire paragraph, for instance, so that they only appear around text nodes like this:
Original:
---------
<Deletion>
<Para>This is a paragraph with a <GlossaryTerm>Stage I</GlossaryTerm>
glossary term.</Para>
</Deletion>

Converted to:
-------------
<Deletion>
<Para><Deletion>This is a paragraph with a </Deletion>
<GlossaryTerm><Deletion>Stage I</Deletion></GlossaryTerm>
<Deletion>glossary term.</Deletion></Para>
</Deletion>
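
The conversion shown above could be sketched roughly as follows. This uses Python's xml.etree.ElementTree rather than the XSLT filters in the QC Insertion/Deletion Set, so it only illustrates the idea; the function name is hypothetical, and whitespace-only text is left unwrapped as a simplifying assumption.

```python
import xml.etree.ElementTree as ET


def wrap_text_nodes(elem, tag):
    """Wrap every non-blank text node below elem in a new <tag> element."""
    children = list(elem)
    for child in children:
        wrap_text_nodes(child, tag)
    rebuilt = []
    # Text before the first child is stored on the element itself.
    if elem.text and elem.text.strip():
        wrapper = ET.Element(tag)
        wrapper.text = elem.text
        elem.text = None
        rebuilt.append(wrapper)
    for child in children:
        rebuilt.append(child)
        # Text following a child is stored as that child's tail.
        if child.tail and child.tail.strip():
            wrapper = ET.Element(tag)
            wrapper.text = child.tail
            child.tail = None
            rebuilt.append(wrapper)
    elem[:] = rebuilt


doc = ET.fromstring(
    "<Deletion><Para>This is a paragraph with a "
    "<GlossaryTerm>Stage I</GlossaryTerm> glossary term.</Para></Deletion>"
)
wrap_text_nodes(doc, "Deletion")
print(ET.tostring(doc, encoding="unicode"))
```

Note that the outer Deletion is still present after this step, which is why a later step has to remove it.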

At a later stage, a subsequent filter removes the outside deletions (or insertions). However, this filter appears to remove the markup only if the child is not a deletion itself, which leaves constructs like the ones you've noticed:
<Deletion>
<Deletion>Deleted Text</Deletion>
</Deletion>

If we're going with your approach, Bob, we can create another filter to clean up these double-markup elements which are probably left since they haven't had any negative effect until now.
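
Such a clean-up step might look roughly like the sketch below. It is written in Python for illustration and is not the XSLT of any existing CDR filter; the function names are hypothetical, and it assumes a wrapper is redundant only when it holds nothing but elements of its own name (plus whitespace).

```python
import xml.etree.ElementTree as ET

MARKUP = {"Insertion", "Deletion"}


def is_redundant(elem):
    """True if elem is a markup element that only wraps elements of its own name."""
    children = list(elem)
    if elem.tag not in MARKUP or not children:
        return False
    if (elem.text or "").strip():
        return False
    return all(c.tag == elem.tag and not (c.tail or "").strip() for c in children)


def unwrap_redundant(parent):
    """Replace redundant wrappers below parent with their children."""
    rebuilt = []
    for child in list(parent):
        unwrap_redundant(child)  # clean deeper levels first
        if is_redundant(child):
            inner = list(child)
            inner[-1].tail = child.tail  # keep text that followed the wrapper
            rebuilt.extend(inner)
        else:
            rebuilt.append(child)
    parent[:] = rebuilt


doc = ET.fromstring(
    "<Para><Deletion>\n<Deletion>Deleted Text</Deletion>\n</Deletion> kept</Para>"
)
unwrap_redundant(doc)
print(ET.tostring(doc, encoding="unicode"))
```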

Comment entered 2010-09-21 14:37:35 by Osei-Poku, William (NIH/NCI) [C]

BZDATETIME::2010-09-21 14:37:35
BZCOMMENTOR::William Osei-Poku
BZCOMMENT::5

(In reply to comment #1)
> (In reply to comment #0)
> > Spanish translators prefer to see only the markup they will be translating,
> > which is usually one or two of them.
>
> I am still trying to understand how the result of this request is different
> from properly using the redline/strikeout QC reports?

To the translators, knowing which elements they are working with is very important. While the QC reports are good, when translators tile two XMetal documents, for example, and go from section to section, and from tag to tag, it makes it easier to identify exactly what needs to be translated. This is why our first request was for an enhancement in XMetal as opposed to the QC reports. However, since that is impossible, a QC report that also shows elements and possibly identifies elements with attributes will be helpful.

Comment entered 2010-09-22 17:03:29 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2010-09-22 17:03:29
BZCOMMENTOR::Volker Englisch
BZCOMMENT::6

(In reply to comment #5)
> While the QC reports are good, when translators tile two XMetal
> documents, for example, and go from section to section, and from tag to tag,
> it makes it easier to identify exactly what needs to be translated.

Another idea along these lines, showing the tags as part of a QC report, could be to apply CSS to XML output. This needs to be explored further, but it would be possible to apply CSS to individual elements, as we're doing within XMetaL, but outside of XMetaL.

I've marked up a document as a test on MAHLER:
file:///M:/home/venglisch/CDR/Filters/62969.xml
(sorry William, I don't think you'll be able to preview this at CIAT)

We would be able to suppress information that would not need to be looked at by the translators like BoardMember or Date elements.
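
A minimal sketch of how a stylesheet might be attached to an XML document so that a browser renders it with CSS. The rules and colours below are placeholders, not the contents of the test document or its stylesheet, and the helper name is hypothetical; only the xml-stylesheet processing instruction itself is standard.

```python
# Placeholder rules for illustration; element names follow the comment above.
EXAMPLE_CSS = """\
Para      { display: block; margin: 0.5em 0; }
Insertion { color: red; }
Deletion  { text-decoration: line-through; }
BoardMember, Date { display: none; }
"""


def attach_stylesheet(xml_text, href):
    """Insert an xml-stylesheet processing instruction before the root element."""
    instruction = f'<?xml-stylesheet type="text/css" href="{href}"?>\n'
    if xml_text.startswith("<?xml "):
        # Keep the XML declaration first; the instruction must follow it.
        end = xml_text.index("?>") + 2
        return xml_text[:end] + "\n" + instruction + xml_text[end:].lstrip()
    return instruction + xml_text


styled = attach_stylesheet("<Summary><Para>Text</Para></Summary>", "xmlstyle.css")
print(styled)
```

With a relative href like this, the CSS file has to sit in the same directory as the XML file for the browser to find it.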

Comment entered 2010-09-22 17:25:22 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2010-09-22 17:25:22
BZCOMMENTOR::Volker Englisch
BZCOMMENT::7

I forgot to mention, the document is best viewed with FireFox or Chrome.

Comment entered 2010-09-24 11:12:43 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2010-09-24 11:12:43
BZCOMMENTOR::Volker Englisch
BZCOMMENT::8

(In reply to comment #4)
> In fact, the filters that are part of
> the QC Insertion/Deletion Set are creating these additional elements.
> I believe the reason for this is that we want to move Insertion/Deletion tags
> which are around an entire paragraph, for instance, so that they only appear
> around text nodes

Because I need to make some changes for OCECDR-3231 (Fix Summary QC Reports) I have to touch the filter set which is wrapping the Insertion/Deletion markup around text nodes and I have fixed this problem along the way.
There was a bug in the filter that removed the "outside" markup elements if other child elements existed that were not Insertion or Deletion elements. As mentioned earlier, this logic failed when there were only Insertion or Deletion elements within Insertion/Deletion elements.
The change to the filter
CDR315892 - Clean up Insertion and Deletion

will be implemented along with the other changes for OCECDR-3231.

Comment entered 2010-09-24 11:20:40 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2010-09-24 11:20:40
BZCOMMENTOR::Volker Englisch
BZCOMMENT::9

William, copy this file to your local drive, then open it with FireFox or Chrome.

Comment entered 2010-09-24 11:20:40 by Englisch, Volker (NIH/NCI) [C]

Attachment 62969.xml has been added with description: XML File with CSS

Comment entered 2010-09-24 11:25:57 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2010-09-24 11:25:57
BZCOMMENTOR::Volker Englisch
BZCOMMENT::10

William, copy this file to the same directory as the XML file you just copied.

Comment entered 2010-09-24 11:25:57 by Englisch, Volker (NIH/NCI) [C]

Attachment xmlstyle.css has been added with description: XML CSS File

Comment entered 2010-09-28 11:37:17 by Osei-Poku, William (NIH/NCI) [C]

BZDATETIME::2010-09-28 11:37:17
BZCOMMENTOR::William Osei-Poku
BZCOMMENT::11

I downloaded the files and installed them on translators’ computers. I also installed Firefox on their computers but after carefully testing this new implementation and comparing with translating from XMetal, the translators concluded that XMetal offers them a better option for translating documents than the report. I have also explained to the translators that modification to XMetal to accommodate their needs isn’t an option.

Comment entered 2010-09-29 14:13:53 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2010-09-29 14:13:53
BZCOMMENTOR::Volker Englisch
BZCOMMENT::12

(In reply to comment #11)
> ... the translators concluded that XMetal offers them a better option
> for translating documents than the report.

I'm not sure where this leaves us with this task. All three options provided here aren't working for the translators and I'm assuming that creating a new HTML QC report won't help either.

What are we going to do now?

Comment entered 2010-09-29 14:27:03 by Osei-Poku, William (NIH/NCI) [C]

BZDATETIME::2010-09-29 14:27:03
BZCOMMENTOR::William Osei-Poku
BZCOMMENT::13

(In reply to comment #12)

> What are we going to do now?

I think it is OK to end the search here. I've explained all the limitations to users.

Comment entered 2010-10-04 16:22:57 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2010-10-04 16:22:57
BZCOMMENTOR::Volker Englisch
BZCOMMENT::14

As discussed at last week's status meeting, this issue will not be addressed further.

Comment entered 2010-10-04 16:24:48 by Osei-Poku, William (NIH/NCI) [C]

BZDATETIME::2010-10-04 16:24:48
BZCOMMENTOR::William Osei-Poku
BZCOMMENT::15

Issue closed. Thank you!

Attachments
File Name Posted User
62969.xml 2010-09-24 11:20:40 Englisch, Volker (NIH/NCI) [C]
RevisionLevel_MarkUp.xls 2010-09-17 09:44:37 Englisch, Volker (NIH/NCI) [C]
xmlstyle.css 2010-09-24 11:25:57 Englisch, Volker (NIH/NCI) [C]
