Issue Number | 3218 |
---|---|
Summary | [TMS] Connecting CDR to World Server |
Created | 2010-09-09 11:50:22 |
Issue Type | Improvement |
Submitted By | Juthe, Robin (NIH/NCI) [E] |
Assigned To | Kline, Bob (NIH/NCI) [C] |
Status | Closed |
Resolved | 2013-03-12 12:02:29 |
Resolution | Fixed |
Path | /home/bkline/backups/jira/ocecdr/issue.107546 |
BZISSUE::4906
BZDATETIME::2010-09-09 11:50:22
BZCREATOR::Robin Juthe
BZASSIGNEE::Bob Kline
BZQACONTACT::Margaret Beckwith
Setting up an umbrella issue for starting the process of integrating World Server SDL translation software with the CDR. The first task will be to acquire the design document for review.
Bob, I'm not sure if this is the most appropriate component, so please revise if need be.
BZDATETIME::2010-09-09 12:01:08
BZCOMMENTOR::Lakshmi Grama
BZCOMMENT::1
I edited the title since we are not really talking of true integration! Integration is a politically loaded word around here. Also I would recommend using TMS as the abbreviation - Translation Management System.
BZDATETIME::2010-09-30 10:51:43
BZCOMMENTOR::Bob Kline
BZCOMMENT::2
[From email message sent to Mauricio 2010-09-28]
Thanks, Mauricio. I have reviewed the document you copied for me. It appears to be a proposal in response to an RFP, rather than a design document. I have a few questions:
1. Could I have a copy of WS.A40 (Technical Architecture and
Design)?
2. Are there really no deliverables for stage II (Solutions Design and
Build)?
3. What stage has the process reached?
4. How will the PDF review output fit into the translation
workflow?
5. Who will create the filters to generate that PDF?
6. Are you the project manager on the NCI side?
7. Who is the PM assigned by SDL?
8. Do you have a definition for the "Push-Poll" connector?
9. Where will the file system used by this connector be located?
My primary (only?) goal at this point is to learn the details of how markup in the original repository documents will be preserved when newly translated content is merged back into the repository.
Thanks!
BZDATETIME::2010-09-30 10:53:03
BZCOMMENTOR::Bob Kline
BZCOMMENT::3
[Email response from Mauricio 2010-09-28]
Hi Bob,
We have not gotten that far yet. We are only at the data gathering process, pre-implementation. Maybe the attached document can answer some of your questions, but it is still a working document and may change a little. Right now, this is all the information we have.
Yes, I will be the PM for NCI and the main point of contact between NCI and the vendor. SDL has not assigned anybody yet because this process has not started.
Will keep you updated.
Thanks
Mauricio
BZDATETIME::2010-10-21 10:16:05
BZCOMMENTOR::William Osei-Poku
BZCOMMENT::4
Added my name to the cc list
BZDATETIME::2011-04-08 12:45:55
BZCOMMENTOR::Bob Kline
BZCOMMENT::5
Margaret and I had a conference call with Pavel this morning to clarify how WorldServer's system works. It turns out that they don't expect us to give them the current version of the Spanish summary documents at all. The only thing they want is the current version of the English summary document, which they'll give back to us with the translatable parts translated, and which they expect us to drop in as a complete replacement version for the Spanish summary document, thus discarding any work the editors have ever done on that document outside the translation memory system.
The only possible use of the WorldServer system I can envision for our documents, given the limitations of that system, would be a process in which we give WS the latest translatable parts of the English document, the translators translate it in WS, and the results are manually pasted back into the Spanish document. It's hard to imagine that the manual reintegration effort would be made worthwhile by the translation work saved by WS.
If summary documents (not to mention the even more problematic structure of the glossary documents) were more like CTGovProtocol documents, in which we fence off most of the document as untouchable except by the import software, leaving a very tightly delimited sandbox in which PDQ editors can make changes which will be preserved by the import process, then we might want to consider a similar approach to this problem. But everything in the summary documents is fair game for editing, and translatable content is so tightly interwoven with non-translatable content that reintegration of a newly translated version into an existing document would be a dangerous nightmare.
We will meet with Mauricio to share our concerns.
BZDATETIME::2011-04-18 11:46:07
BZCOMMENTOR::Bob Kline
BZCOMMENT::6
We met with Mauricio and gave him an earful. He's reluctant to give up, so we agreed to try a pilot project with just a few summaries so we can get empirical data on how difficult it would be for the translators to lose the ability to work with the Spanish summaries in the CDR. Using WorldServer for glossary document translation is off the table.
BZDATETIME::2011-04-26 10:11:20
BZCOMMENTOR::Bob Kline
BZCOMMENT::7
Mauricio will set up a meeting with the technical folks at SDL to discuss next steps.
BZDATETIME::2011-09-01 12:30:15
BZCOMMENTOR::Margaret Beckwith
BZCOMMENT::8
Just thought I would update this a bit:
1. May 25-26: SDL was onsite for a 2-day meeting to discuss
implementation for CDR, Percussion, and Adhoc documents. Outcome was to
break implementation into phases, where Phase I would end sometime in
Feb. 2012. For CDR, Phase I entails doing alignment on 10 documents and
authoring 2 new documents in Trados. Challenges with using the system to
update existing summaries were not worked out.
2. Aug. 24-25: Training for users and admin. was given on use of Trados
and Worldserver. The software has been installed on CIAT translator
computers.
3. Sept. 1: There will be a follow-up telephone call today to talk about
next steps.
BZDATETIME::2011-09-02 07:25:30
BZCOMMENTOR::William Osei-Poku
BZCOMMENT::9
Linda would need the XML file for the Lung Cancer Screening HP summary - CDR 62832 to begin testing translations in SDL World Server and Trados next week or so. We also need to explore ways of getting the translated document back into the CDR.
BZDATETIME::2011-09-02 07:38:35
BZCOMMENTOR::Bob Kline
BZCOMMENT::10
Has the location where I'm supposed to place the file for pickup by Trados been established?
BZDATETIME::2011-09-06 09:26:57
BZCOMMENTOR::William Osei-Poku
BZCOMMENT::11
(In reply to comment #10)
> Has the location where I'm supposed to place the file for pickup by Trados been established?
Not yet. Until we decide on a place, you can attach it to this issue, since it is only one file, or you can place it on the ftp server and we will download it from there.
BZDATETIME::2011-09-06 09:32:48
BZCOMMENTOR::Bob Kline
BZCOMMENT::12
Attachment 62832.xml has been added with description: Document for Linda to translate in standalone mode
BZDATETIME::2011-09-22 09:45:25
BZCOMMENTOR::William Osei-Poku
BZCOMMENT::13
Linda needs the xml file for the Lung Cancer Screening summary - 258019 - for testing.
BZDATETIME::2011-09-22 10:33:35
BZCOMMENTOR::Bob Kline
BZCOMMENT::14
(In reply to comment #13)
> Linda needs the xml file for the Lung Cancer Screening summary - 258019 - for testing.
Any particular version, or should I just give her the current working document?
BZDATETIME::2011-09-22 10:43:39
BZCOMMENTOR::William Osei-Poku
BZCOMMENT::15
(In reply to comment #14)
> (In reply to comment #13)
> > Linda needs the xml file for the Lung Cancer Screening summary - 258019 - for testing.
>
> Any particular version, or should I just give her the current working document?
She wants the Last Publishable Version.
BZDATETIME::2011-09-23 11:42:34
BZCOMMENTOR::Bob Kline
BZCOMMENT::16
Attachment 258019v46.xml has been added with description: Another standalone test document for Linda
BZDATETIME::2011-09-27 15:20:06
BZCOMMENTOR::William Osei-Poku
BZCOMMENT::17
Linda has access to the "Translation Management System" folder on the
L:Drive. Path=
nci6116g.nci.nih.gov\group\OCE_CROSS\Translation Management System
Future requests for xml files should be placed in this folder.
BZDATETIME::2011-09-30 10:12:57
BZCOMMENTOR::William Osei-Poku
BZCOMMENT::18
(In reply to comment #17)
> Linda has access to the "Translation Management System" folder on the L:Drive.
> Path= nci6116g.nci.nih.gov\group\OCE_CROSS\Translation Management System
>
> Future requests for xml files should be placed in this folder.
Linda needs the xml file for 62765. Please place it in the above folder. Please expedite this request if you can. She needs it by this afternoon.
BZDATETIME::2011-09-30 11:09:54
BZCOMMENTOR::Bob Kline
BZCOMMENT::19
I created a subfolder called Documents and dropped the XML file there.
BZDATETIME::2011-09-30 12:24:03
BZCOMMENTOR::William Osei-Poku
BZCOMMENT::20
(In reply to comment #19)
> I created a subfolder called Documents and dropped the XML file there.
Thank you. Linda is working from home and for some reason she could not access the drive even though she logged in to VPN. Please attach the file to this issue.
BZDATETIME::2011-09-30 12:27:28
BZCOMMENTOR::Bob Kline
BZCOMMENT::21
Attachment CDR62765V165.xml has been added with description: Third test document
BZDATETIME::2011-09-30 12:28:58
BZCOMMENTOR::William Osei-Poku
BZCOMMENT::22
(In reply to comment #21)
> Created attachment 2165 [details]
> Third test document
Thank you!
BZDATETIME::2011-10-06 11:55:42
BZCOMMENTOR::William Osei-Poku
BZCOMMENT::23
We decided that instead of requesting individual summaries each time Linda needs one for testing, she will request 5 to 10 of them at a time. These should be placed in the folder on the L drive, and when she has exhausted them she can request more.
BZDATETIME::2011-10-14 09:52:29
BZCOMMENTOR::William Osei-Poku
BZCOMMENT::24
Linda needs the XML for 62751 for testing in WS. Please place it in the folder you created on the L drive.
BZDATETIME::2011-10-14 14:48:02
BZCOMMENTOR::Bob Kline
BZCOMMENT::25
(In reply to comment #24)
> Linda needs the XML for 62751 for testing in WS. Please place it in the folder you created on the L drive.
Done.
BZDATETIME::2011-10-21 15:52:15
BZCOMMENTOR::William Osei-Poku
BZCOMMENT::26
Here are the elements Linda wanted stripped from the XML files:
Comments
Media Links
Section Type Metadata
Replacement for
Changes to this summary section
PDQ Key
SummaryFrag Refs ?
DateLast Modified
ComprehensiveReviewDate
Also, Bob will be loading some xml documents from World Server onto Franck for Linda to review.
BZDATETIME::2011-10-24 13:07:45
BZCOMMENTOR::Bob Kline
BZCOMMENT::27
(In reply to comment #26)
> Bob will be loading some xml documents from World Server onto Franck for Linda to review.
Why don't you look at them on Mahler before I create them on Franck, just in case changes to the software are needed.
http://mahler.nci.nih.gov/cgi-bin/cdr/show-cdr-doc.py?id=696910
http://mahler.nci.nih.gov/cgi-bin/cdr/show-cdr-doc.py?id=696911
(Or just bring up CDR696910 and CDR696911 in XMetaL.)
I noticed that the UTF-8 encoding was garbled for the trademark symbol in the SummaryUrl element for CDR696910. I checked the original English summary document, and the encoding is correct there, so it seems that the corruption occurred somewhere in the translation process, something which might need to be discussed with Trados.
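For discussion with the vendor, here is a minimal sketch of how this kind of garbling typically arises and how a document could be checked for it. It assumes the UTF-8 bytes were misread as Windows-1252 somewhere along the way; the thread does not establish which encoding was actually involved, and `garble` and `looks_garbled` are hypothetical helper names.

```python
# Illustrative only: the wrong encoding (cp1252) is an assumption,
# not something confirmed for the Trados round trip.
TRADEMARK = "\u2122"  # the trademark sign


def garble(text, wrong_encoding="cp1252"):
    """Simulate UTF-8 bytes being misread as a single-byte encoding."""
    return text.encode("utf-8").decode(wrong_encoding)


def looks_garbled(text, wrong_encoding="cp1252"):
    """Heuristic check: reversing the misreading yields different, valid text."""
    try:
        repaired = text.encode(wrong_encoding).decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return False
    return repaired != text


if __name__ == "__main__":
    damaged = garble(TRADEMARK)
    print(repr(damaged), looks_garbled(damaged), looks_garbled(TRADEMARK))
```

The check is only a heuristic: it flags strings that decode cleanly when the assumed misreading is reversed, so it would miss corruption introduced by any other encoding.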
BZDATETIME::2011-10-26 12:30:07
BZCOMMENTOR::Bob Kline
BZCOMMENT::28
(In reply to comment #26)
> Here are the elements Linda wanted stripped from the XML files:
> :
> :
> PDQ Key
So, strip the PDQKey elements, but not the PDQKey attributes?
> SummaryFrag Refs ?
What does the question mark mean here? Are the users still thinking about whether they want these stripped? Would it possibly be better to leave them in the documents, and have the software replace the cdr:href values which contain the CDR ID of the English document with a comparable value containing the CDR ID of the (possibly new) Spanish document? This only works, of course, if the cdr:id values of the English document are carried over intact into the Spanish documents, which I assume they would be.
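A rough sketch of the replacement proposed above, assuming the fragment references carry a `cdr:href` of the form `CDR0000062832#_12` and that the `cdr` prefix maps to the namespace URI shown. Both are assumptions for illustration, as are the function name and the element names in the sample; only the document part of the value is swapped and the fragment ID is left alone.

```python
import xml.etree.ElementTree as ET

CDR_NS = "cips.nci.nih.gov/cdr"  # assumed namespace URI for the cdr: prefix
HREF = f"{{{CDR_NS}}}href"


def retarget_fragment_refs(root, english_id, spanish_id):
    """Point fragment links at the Spanish document; return the number changed.

    Only values of the form '<english_id>#<fragment>' are rewritten.
    Links to other documents, or without a fragment, are left untouched.
    """
    changed = 0
    for node in root.iter():
        value = node.get(HREF)
        if value is None or "#" not in value:
            continue
        doc_id, fragment = value.split("#", 1)
        if doc_id == english_id:
            node.set(HREF, f"{spanish_id}#{fragment}")
            changed += 1
    return changed


if __name__ == "__main__":
    sample = (
        '<Summary xmlns:cdr="cips.nci.nih.gov/cdr">'
        '<Para cdr:id="_1">See '
        '<SummaryFragmentRef cdr:href="CDR0000062832#_1">here</SummaryFragmentRef>'
        '</Para></Summary>'
    )
    root = ET.fromstring(sample)
    print(retarget_fragment_refs(root, "CDR0000062832", "CDR0000696910"))
```

As noted, this depends on the `cdr:id` values surviving translation unchanged; if they do not, the rewritten links would point at fragments that no longer exist.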
BZDATETIME::2011-10-26 15:20:10
BZCOMMENTOR::Bob Kline
BZCOMMENT::29
I have created a web interface which Linda can use to retrieve an English summary document with the Comment, MediaLink, SectMetaData, ReplacementFor, PdqKey, DateLastModified, and ComprehensiveReviewDate elements stripped. She enters the CDR ID and optionally a version (which can be 'pub' to request the most recent publishable version). There are two options for output: the default is to display the document in the browser, and the other option lets her save the raw XML document to her computer, from which she can then pass it on to Trados. With this tool she can get documents for testing without waiting for me to fetch and process them. Here's the URL:
http://bach.nci.nih.gov/cgi-bin/cdr/get-english-summary.py
BZDATETIME::2011-10-27 12:59:36
BZCOMMENTOR::William Osei-Poku
BZCOMMENT::30
(In reply to comment #28)
> (In reply to comment #26)
> > Here are the elements Linda wanted stripped from the XML files:
> > :
> > :
> > PDQ Key
>
> So, strip the PDQKey elements, but not the PDQKey attributes?
>
> > SummaryFrag Refs ?
>
> What does the question mark mean here? Are the users still thinking about whether they want these stripped? Would it possibly be better to leave them in
The question mark was meant to ask whether it is possible to strip the SummaryFrag Refs.
> the documents, and have the software replace the cdr:href values which contain the CDR ID of the English document with a comparable value containing the CDR ID of the (possibly new) Spanish document? This only works, of course, if the cdr:id values of the English document are carried over intact into the Spanish documents, which I assume they would be.
The fragment ids in the Spanish document would be different from the fragment ids of the English version. Would it be possible to still do the replacement?
(In reply to comment #29)
> for me to fetch and process them. Here's the URL:
>
> http://bach.nci.nih.gov/cgi-bin/cdr/get-english-summary.py
Thank you! This is very helpful. Linda has started using it
already.
Could you remove the PDQBoardMember elements from the xml? Also strip
all occurrences of MainTopics except the very first occurrence of
it?
BZDATETIME::2011-10-28 09:18:42
BZCOMMENTOR::Bob Kline
BZCOMMENT::31
(In reply to comment #30)
> The fragment ids in the Spanish document would be different from the fragment ids of the English version. Would it be possible to still do the replacement?
We agreed in the meeting that you would investigate whether Linda might have been confused about the fragment IDs being altered by Trados.
> Could you remove the PDQBoardMember elements from the xml? Also strip all occurrences of MainTopics except the very first occurrence of it?
Done. Please give it a try.
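For reference, the filtering rules requested so far can be sketched roughly as follows. This is not the code behind get-english-summary.py. It is a minimal illustration using the standard library, with the element names taken from this thread and the helper names (`drop`, `strip_for_translation`) made up for the example.

```python
import xml.etree.ElementTree as ET

# Element names as listed in this thread; everything else is illustrative.
STRIP = {"Comment", "MediaLink", "SectMetaData", "ReplacementFor", "PdqKey",
         "DateLastModified", "ComprehensiveReviewDate", "PDQBoardMember"}
KEEP_FIRST_ONLY = {"MainTopics"}


def drop(parent, child):
    """Remove child from parent, preserving the text that follows it."""
    tail = child.tail or ""
    index = list(parent).index(child)
    if index == 0:
        parent.text = (parent.text or "") + tail
    else:
        previous = parent[index - 1]
        previous.tail = (previous.tail or "") + tail
    parent.remove(child)


def strip_for_translation(root):
    """Strip unwanted elements in place, walking the tree in document order."""
    parents = {child: parent for parent in root.iter() for child in parent}
    removed = set()
    seen = set()
    for node in list(root.iter()):
        parent = parents.get(node)
        if parent is None:
            continue  # the root element is never stripped
        if parent in removed:
            removed.add(node)  # already gone along with an ancestor
        elif node.tag in STRIP or (node.tag in KEEP_FIRST_ONLY and node.tag in seen):
            drop(parent, node)
            removed.add(node)
        elif node.tag in KEEP_FIRST_ONLY:
            seen.add(node.tag)
    return root
```

Preserving the tail text matters for inline elements such as Comment, where removing the element naively would also discard the sentence text that follows it.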
BZDATETIME::2011-10-28 12:36:37
BZCOMMENTOR::William Osei-Poku
BZCOMMENT::32
(In reply to comment #31)
> (In reply to comment #30)
>
> > The fragment ids in the Spanish document would be different from the fragment ids of the English version. Would it be possible to still do the replacement?
>
> We agreed in the meeting that you would investigate whether Linda might have been confused about the fragment IDs being altered by Trados.
>
I spoke with Linda this morning and we looked at a few examples and it
seems to me now that it would work the way you described in comment #28
and also explained yesterday in the meeting. In view of that please
proceed to make the needed changes.
> > Could you remove the PDQBoardMember elements from the xml? Also strip all occurrences of MainTopics except the very first occurrence of it?
>
> Done. Please give it a try.
All changes have been confirmed. Thank you!
BZDATETIME::2011-11-04 12:01:51
BZCOMMENTOR::William Osei-Poku
BZCOMMENT::33
Linda has completed translating the attached document and wants it imported into Mahler for her to review.
Attachment assets.zip has been added with description: document to transfer to Mahler
BZDATETIME::2011-11-04 14:42:26
BZCOMMENTOR::William Osei-Poku
BZCOMMENT::34
(In reply to comment #33)
> Created attachment 2180 [details]
> document to transfer to Mahler
>
> Linda has completed translating the attached document and wants it imported into Mahler for her to review.
Bob, is it possible to expedite this for Linda today?
BZDATETIME::2011-11-04 15:06:26
BZCOMMENTOR::Bob Kline
BZCOMMENT::35
(In reply to comment #34)
> (In reply to comment #33)
> > Created attachment 2180 [details]
> > document to transfer to Mahler
> >
> > Linda has completed translating the attached document and wants it imported into Mahler for her to review.
>
> Bob, is it possible to expedite this for Linda today?
Imported as CDR696914. Very odd document; lots of the text content has disappeared.
BZDATETIME::2012-03-08 14:16:03
BZCOMMENTOR::Margaret Beckwith
BZCOMMENT::36
Lowering priority until further notice.
BZDATETIME::2012-12-31 11:25:40
BZCOMMENTOR::Bob Kline
BZCOMMENT::37
For some reason Linda's not a user in Bugzilla. She has asked for the import of seven new Spanish summaries created in Trados. Bumping up priority so I can handle her request.
BZDATETIME::2012-12-31 11:29:05
BZCOMMENTOR::Bob Kline
BZCOMMENT::38
Attachment newpdqspanishsummariestranslatedinworldservertrados.zip has been added with description: XML files for new summaries to be created in the CDR
BZDATETIME::2012-12-31 11:31:09
BZCOMMENTOR::William Osei-Poku
BZCOMMENT::39
(In reply to comment #37)
> For some reason Linda's not a user in Bugzilla.
A folder was created on the L:Drive for her to drop the XML files in and then inform you about them. She is having a problem with the folder because she is temporarily using a loaner laptop, which is why she emailed you the files. She should be sorted out soon and start using the folder on the L:Drive.
BZDATETIME::2012-12-31 11:35:17
BZCOMMENTOR::Bob Kline
BZCOMMENT::40
Documents have been imported on Franck. Please have Linda take a look and make sure they appear to have been created correctly before I add them to the production system.
CDR739849
CDR739850
CDR739851
CDR739852
CDR739853
CDR739854
CDR739855
BZDATETIME::2012-12-31 13:56:04
BZCOMMENTOR::William Osei-Poku
BZCOMMENT::41
Linda reviewed them and they all look good. Please promote to Bach.
BZDATETIME::2012-12-31 14:04:39
BZCOMMENTOR::Bob Kline
BZCOMMENT::42
(In reply to comment #41)
> Linda reviewed them and they all look good. Please promote to Bach.
Done. CDR0000744468 through CDR0000744474.
BZDATETIME::2013-01-03 11:16:34
BZCOMMENTOR::William Osei-Poku
BZCOMMENT::43
I have attached a new file to be imported/created in the CDR
Attachment Colorectal Cancer Screening - HP.xml has been added with description: XML file summary to be created in the CDR.
BZDATETIME::2013-01-03 12:26:08
BZCOMMENTOR::Bob Kline
BZCOMMENT::44
(In reply to comment #43)
> I have attached a new file to be imported/created in the CDR
Added as CDR0000744628.
BZDATETIME::2013-03-12 12:02:04
BZCOMMENTOR::William Osei-Poku
BZCOMMENT::45
Closing this issue. I will enter a new issue if Linda needs a new file.
Marked as Resolved - Fixed
BZDATETIME::2013-03-12 12:02:29
BZCOMMENTOR::William Osei-Poku
BZCOMMENT::46
Closed.
File Name | Posted | User |
---|---|---|
258019v46.xml | 2011-09-23 11:42:34 | |
62832.xml | 2011-09-06 09:32:48 | |
assets.zip | 2011-11-04 12:01:51 | Osei-Poku, William (NIH/NCI) [C] |
CDR62765V165.xml | 2011-09-30 12:27:28 | |
Colorectal Cancer Screening - HP.xml | 2013-01-03 11:16:34 | Osei-Poku, William (NIH/NCI) [C] |
newpdqspanishsummariestranslatedinworldservertrados.zip | 2012-12-31 11:29:05 | |