Issue Number | 4683 |
---|---|
Summary | [Summaries] Modify XML Utility for World Server Translation |
Created | 2019-10-24 14:06:05 |
Issue Type | Improvement |
Submitted By | Osei-Poku, William (NIH/NCI) [C] |
Assigned To | Kline, Bob (NIH/NCI) [C] |
Status | Closed |
Resolved | 2020-04-17 11:20:58 |
Resolution | Fixed |
Path | /home/bkline/backups/jira/ocecdr/issue.251521 |
Please modify the XML tool used for generating XML documents for translation in World Server to exclude text marked up with the "approved" Insertion and ignore markup with "approved" deletion.
Need to check on how approved deleted text is being handled. This text still needs to be translated.
I assume we need to get an answer to Robin's implied question before we implement this modification. It seems possible that the answer might be that we shouldn't make this modification after all, because it would prevent us from translating the approved text.
Right, this is something that ~oseipokuw is checking on. While it makes sense to exclude text in approved insertion elements, we want to be sure that text that is in approved deletion markup still gets translated since it is published text.
I have modified the original request to include approved deletion markup as well even though this is rarely found in summaries.
~oseipokuw, are you sure this is what you intended? I think we want to include text that is within approved deletion markup (this is published text) and exclude text that is within approved insertion markup.
There are a couple of things I don't understand here.
1. Why would we not want to translate text which someone has proposed for insertion, if that proposal has been approved? Does "approved" mean something different from "we've decided we want this inserted text/element/whatever"?
2. Why would we want to treat insertions and deletions the same way, when those two actions are the opposite of each other?
Really? If I propose that we add a paragraph, and you approve my proposal, why wouldn't we want that paragraph translated? Similarly, if I suggest "let's get rid of that other paragraph" and you approve my suggestion, why would we bother to translate the paragraph, if you've decided we're getting rid of it?
Sorry, that is not what I intended. That is my mistake. This is actually the current behavior of the program, as I stated in OCECDR-4587 (quoted below). Namely, markup with approved deletion is deleted, and we want it to be ignored.
"However, we ran into another problem that has to do with "Approved" Insertion and Deletion markup in one of the documents on DEV, CDR62932 version 399. The "Approved" Insertion markup text in the summary is included in the XML while the "Approved" Deletion text is deleted from the XML. Is this the expected behavior of the program? While users assign "Approved" revision level attributes to the text, they are really not ready to publish yet and they should not be translated yet."
1. We had a lengthy discussion about this when talking about OCECDR-4587. The short answer is that in certain cases, we don't follow the markup procedure the way it was intended. Essentially, the markup has to be accepted in the document for it to be considered final.
2. That was a mistake on my part and I have corrected it. I did not go back to look at the implication of what I had edited. I apologize for the confusion.
OK. To word the requirements more precisely:
For Insertion elements with a RevisionLevel attribute value of "publish" the Insertion markup will be removed, and the contents of those elements will be retained.
All other Insertion elements will be discarded with their contents.
Deletion elements with a RevisionLevel attribute value of "publish" will be discarded with their contents.
For all other Deletion elements the Deletion markup will be removed, and the contents of those elements will be retained.
These rules are applied recursively. So, for example, in the following snippet:

    <Insertion RevisionLevel="approved">
      <Para>
        <Insertion RevisionLevel="publish">even though ....</Insertion>
        We're throwing this away
      </Para>
    </Insertion>

the entire paragraph and its Insertion wrapper will be discarded.
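
The rules above can be sketched in a few lines of Python. This is only an illustration, not the code installed for this ticket: it assumes the standard library's `xml.etree.ElementTree`, the helper names (`strip_revision_markup`, `_fate`, `_flatten`, `_append`) are made up for the example, and the `Summary` wrapper and second `Para` in the sample are hypothetical test data.

```python
import xml.etree.ElementTree as ET

def _fate(elem):
    """True = drop the markup but keep contents, False = discard, None = not revision markup."""
    publish = elem.get("RevisionLevel") == "publish"
    if elem.tag == "Insertion":
        return publish          # only "publish" insertions are kept
    if elem.tag == "Deletion":
        return not publish      # only "publish" deletions are dropped
    return None

def _append(parent, items):
    """Attach a mixed list of strings and elements to parent, preserving order."""
    last = None
    for item in items:
        if isinstance(item, str):
            if last is None:
                parent.text = (parent.text or "") + item
            else:
                last.tail = (last.tail or "") + item
        else:
            parent.append(item)
            last = item

def _flatten(elem):
    """Return the strings and elements which replace elem in its parent."""
    fate = _fate(elem)
    if fate is False:
        return []               # discarded along with everything inside it
    items = [elem.text] if elem.text else []
    for child in elem:
        items.extend(_flatten(child))   # rules are applied recursively
        if child.tail:
            items.append(child.tail)    # tail text belongs to the parent
    if fate is True:
        return items            # wrapper removed, contents retained
    copy = ET.Element(elem.tag, dict(elem.attrib))
    _append(copy, items)
    return [copy]

def strip_revision_markup(root):
    """Return a cleaned copy of root (assumed not to be revision markup itself)."""
    return _flatten(root)[0]

# Hypothetical sample: the first block mirrors the snippet in this comment.
doc = ET.fromstring(
    '<Summary>'
    '<Insertion RevisionLevel="approved"><Para>'
    '<Insertion RevisionLevel="publish">even though ....</Insertion>'
    "We're throwing this away</Para></Insertion>"
    '<Para>Keep <Deletion RevisionLevel="approved">this</Deletion>'
    '<Deletion RevisionLevel="publish">not this</Deletion> text</Para>'
    '</Summary>'
)
cleaned = ET.tostring(strip_revision_markup(doc), encoding="unicode")
print(cleaned)
```

Building a fresh tree sidesteps the lack of parent pointers in `ElementTree`, and treating each child's tail text as part of the parent keeps the text that follows a discarded element.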
"I have modified the original request to include approved deletion markup as well even though this is rarely found in summaries."
There are indeed fewer approved Deletion elements than approved Insertion elements in summaries, but I would say it would be a mistake to characterize them as "rare." There are 2,219 approved Insertion elements and 1,896 approved Deletion elements in summaries on PROD. So that's over 85% as many Deletion elements as Insertion elements.
Would you mind sharing the numbers for the last published versions of summaries, which are typically the versions used for generating the XML for World Server?
Also, if you can provide some of the CDR IDs of the ones with approved deletions, that would be good. Thanks!
('Deletion', 'proposed') 1027
('Insertion', 'proposed') 1028
('Deletion', 'approved') 130
('Insertion', 'approved') 149
1 CDR62729 version 108
11 CDR62740 version 82
9 CDR62781 version 49
11 CDR62782 version 90
1 CDR62785 version 50
1 CDR62829 version 383
9 CDR62903 version 223
2 CDR62928 version 42
21 CDR62938 version 116
3 CDR62960 version 211
5 CDR258102 version 200
3 CDR258195 version 237
2 CDR334406 version 11
1 CDR350260 version 107
3 CDR658500 version 67
2 CDR668479 version 24
32 CDR763423 version 26
1 CDR780119 version 12
1 CDR790949 version 15
3 CDR790961 version 17
2 CDR798740 version 21
3 CDR798746 version 21
2 CDR798749 version 16
1 CDR799642 version 13
The first four rows have:
element name
RevisionLevel attribute value
number of occurrences in the current publishable summary versions
The remaining rows contain:
number of Deletion elements in the document with RevisionLevel of "approved"
document ID
number of the latest publishable version
As you can see, there's not much difference in the number of approved Deletion and Insertion elements. Certainly not enough to characterize one as "rare" compared with the other.
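
For anyone who wants to reproduce counts of this kind, here is a minimal sketch of tallying revision markup by element name and RevisionLevel. The script actually used for the numbers above is not shown in this ticket, so the function name `count_revision_markup` and the sample document are assumptions for illustration only; a real run would loop over the latest publishable version of each summary.

```python
from collections import Counter
import xml.etree.ElementTree as ET

def count_revision_markup(root):
    """Tally Insertion and Deletion elements by (element name, RevisionLevel)."""
    counts = Counter()
    for elem in root.iter():
        if elem.tag in ("Insertion", "Deletion"):
            counts[(elem.tag, elem.get("RevisionLevel"))] += 1
    return counts

# Hypothetical sample document standing in for one summary version.
sample = ET.fromstring(
    '<Summary><Para>'
    '<Insertion RevisionLevel="approved">a</Insertion>'
    '<Deletion RevisionLevel="approved">b</Deletion>'
    '<Deletion RevisionLevel="proposed">c</Deletion>'
    '<Deletion RevisionLevel="approved">d</Deletion>'
    '</Para></Summary>'
)
counts = count_revision_markup(sample)
for key, total in counts.items():
    print(key, total)
```

Summing the `Counter` objects from each document gives corpus-wide totals in the same shape as the first four rows of the report, and the per-document `("Deletion", "approved")` entry corresponds to the first column of the remaining rows.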
Thanks, Bob! I was thinking in terms of the approved deletion element at the summary level rather than the number of occurrences in each summary when I said they were rare. However, I do understand that even at 24 summaries (if that is the total number of affected summaries), that is too many to call them rare. Well, at least they are not in the two thousands 😃. Thanks again for providing these stats.
Yes, this is what we expect. Thanks!
Installed on DEV.
Verified on DEV. Thanks!
Verified on QA. Thanks!
Working as expected on PROD?
We have not been able to test this fix on PROD yet. I am closing the ticket and will reopen if necessary.