Issue Number | 5104 |
---|---|
Summary | Browser Title for PDQ Summaries - Can we edit this in the CDR? |
Created | 2022-03-17 16:53:24 |
Issue Type | Inquiry |
Submitted By | Juthe, Robin (NIH/NCI) [E] |
Assigned To | Kline, Bob (NIH/NCI) [C] |
Status | Closed |
Resolved | 2023-06-07 06:53:13 |
Resolution | Fixed |
Path | /home/bkline/backups/jira/ocecdr/issue.313385 |
We discussed this question in today's status meeting and realized we don't know the answer. While there are three TitleType attributes on the AltTitle field in the CDR, it appears that only one of these ("Short") is being sent to Cancer.gov. The short title appears to be used on the CTHP. It's unclear how the browser title field in Drupal is being populated, and which field is used to provide the browser title on a PDQ summary.
Let's revisit this discussion with the Cancer.gov team. Just putting in this ticket as a reminder.
New ticket created for the WCMS team: https://github.com/NCIOCPL/cgov-digital-platform/issues/3513
As discussed, I'm including an earlier email (2022-03-21) here summarizing the data flow of the summary AltTitle from the CDR to Drupal:
Currently, we have three valid values for the TitleType attribute of the AltTitle element:
Short
Display (unused)
Navlabel (unused)
The Short alt title is a required element and the
CDR validation rules restrict the Short alt title to 64 characters or
fewer.
The Navlabel alt title is limited to 100 characters or
fewer but the Drupal software no longer makes any use of this Navlabel
alt title. What now becomes the left navigation label value in Drupal is
instead derived from the Short alt title.
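As a rough illustration of the length limits described above, here is a minimal sketch in Python. The function and table names are hypothetical and this is not the CDR validation code; only the two limits (64 and 100 characters) come from the description.

```python
# Hypothetical lookup table; only the two numbers come from the description above.
MAX_LENGTHS = {"Short": 64, "Navlabel": 100}


def check_alt_title(title_type, text):
    """Return an error message, or None if the title is within its limit."""
    limit = MAX_LENGTHS.get(title_type)
    if limit is not None and len(text) > limit:
        return f"{title_type} AltTitle is {len(text)} characters (limit {limit})"
    return None
```

Title types without an entry in the table (for example "Display") are simply not length-checked in this sketch.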
The short title is plugged into the "field_browser_title" for summary documents, which is in turn used for Drupal's left nav label as well as for the CTHP cards (Cancer Type Home Page). So we still have a nav label, but it’s not coming from the CDR AltTitle with the Navlabel attribute.
Originally, the browser tab was supposed to display the Short AltTitle, but that would cause the browser titles for the HP and Patient summaries to be identical (e.g. “Breast Cancer Treatment” for both), and SEO doesn't like that. This was likely the reason why the browser tab now shows the full summary title, which is often cut off because the browser tab only displays about 66 characters.
The left nav label is typically identical to the short title (a.k.a.
CTHP title) but it can be overwritten manually. Once the left nav label
has been overwritten it won’t be changed anymore when the short title is
updated. In short: A CDR document that has a modified short title will
always update the CTHP title. It will only update the left nav title if
that title is identical to the CTHP title and not set manually.
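The update rule described above could be sketched roughly as follows. This is only an illustration with hypothetical names, not the Drupal code, and it approximates "not set manually" by checking whether the nav label still equals the old short title.

```python
def sync_titles(old_short, new_short, nav_label):
    """Return (cthp_title, nav_label) after the short title changes.

    The CTHP title always follows the short title. The nav label follows
    only if it still matches the old short title, which this sketch treats
    as meaning it has not been overridden by hand.
    """
    cthp_title = new_short
    if nav_label == old_short:
        nav_label = new_short
    return cthp_title, nav_label
```

With a manually set label such as "Wilms & Other Childhood Kidney Tumors Treatment", a change to the short title would update the CTHP title but leave the nav label alone.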
An example for a summary with differing document title, short title, and
left nav label is “Wilms Tumor and other Childhood Kidney Tumors” https://www.cancer.gov/types/kidney.
Doc title: Wilms Tumor and Other Childhood Kidney Tumors Treatment
(PDQ®)–Patient Version
Short title: Wilms Tumor and Other Childhood Kidney Tumors Treatment
Nav title: Wilms & Other Childhood Kidney Tumors Treatment
To set the nav_title, go to www-cms-dev.cancer.gov and select the menu
Structure -> Taxonomy -> SiteSection
-> Home (click the children link) -> Cancer Types
(click the children link) -> find the cancer type, e.g. Kidney
Cancer (click the children link) -> Patient or HP (click the
children link) -> click the Edit button for your summary
Hi everyone,
My last email described in detail where the SummaryTitle and AltTitle content ends up in Drupal. The understanding of the data flow for these elements, however, was just a related question to the original question from Robin if it is possible to control the browser title from within the CDR.
As we have seen, there are two elements in Drupal storing title information. These are the elements
Title
This element holds the full document title (CDR SummaryTitle plus
PDQ®-audience) and is also used to populate the HTML “<title/>”
element, which is created by concatenating this field with the site
name “ – National Cancer Institute”. The content of this element, or at
least the first 60 characters or so, is used for the text displayed
on the tab and the hover text of the tab.
Example:
Doc title: Childhood Midline Tract Carcinoma Involving the NUT Gene (NUT
Midline Carcinoma) Treatment (PDQ®)–Health Professional Version
Tab title: Childhood Midline Tract Carcinoma Involving the NUT Gene (NUT
Midline Carcinoma) Treatment (PDQ®)–Health Professional Version -
National Cancer Institute
Browser Title
The description for this element indicates that it is used as the
browser title (concatenated with the site name “National Cancer
Institute”) but that is incorrect (see above). The field contains the
CDR AltTitle (type=Short). As mentioned yesterday, the original idea
was probably to use this information as the browser title but when that
decision changed the element’s name and description didn’t get
adjusted. This text is used for the navigation links.
Example:
Short title: Childhood Midline Tract Carcinoma Treatment
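To make the difference between the two fields concrete, here is a small sketch of how the tab text appears to be derived from the Title field. The function names are hypothetical, the " - " separator is copied from the tab title example above, and the width of 60 is only the approximate figure mentioned in this comment.

```python
SITE_NAME = "National Cancer Institute"


def html_title(node_title, separator=" - "):
    """Build the <title> text from the Title field plus the site name."""
    return f"{node_title}{separator}{SITE_NAME}"


def visible_tab_text(title, width=60):
    """Approximate what fits on a browser tab (the width is an assumption)."""
    return title[:width]
```

For the NUT midline carcinoma example, everything past the first 60 or so characters of the full title, including the audience and the site name, would not be visible on the tab.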
The question for Lindsay is now:
If the CDR were to provide an additional AltTitle
(type=Tab), for example, would Drupal have a place to
store this information in order to control the browser title?
Somehow Jira decided 2 comments are enough. It won't let me add a third one ... or will it?
As far as the question about whether Drupal would have a way to store an additional AltTitle from CDR to control the Browser title on summaries, I think this is something for the larger Digital Platform product dev team and probably not really something Lindsay can answer. A preliminary discussion of what would be involved/how it would work could be a good first step, and then we’ll likely need a Cancer.gov Digital Platform ticket, assuming this isn’t something that can be done with no changes.
PS – Overriding nav labels in the Drupal CMS is obviously a closely held function for IAs and not something most folks are allowed to do. Just wanted to mention that in case you were wondering or worried about changes being made without approval.
The previous comment was Amy's response to my last email, but Jira refused to add it via copy/paste. I hope you all did not receive the comment multiple times. If you did, blame Jira. 🙂
The steps for implementing this will include:
add the new field to the pdq_cancer_information_summary type (Drupal/Bob)
modify the REST plugin code to accept and store the new field's value (Drupal/Bob)
modify the twig template(s) to adjust which field values are used where (Drupal/TBD)
modify the Summary schema and XMetaL CSS to support the new value (CDR/Bob)
populate the summary documents with the new value (CDR/PCIB+CIAT)
modify the publishing filters to pick up and export the new value (CDR/Volker)
We had spoken (in an earlier CDR/EBMS status meeting), I believe, about using type values for the AltTitle element reflecting the usage of titles on the web site. So, for example, we might run a global change renaming "Short" to "Browser" or perhaps cloning the "Short" AltTitle elements as "Browser" AltTitle elements, and adding "CTHP" as another valid type value. Let's discuss the options in this afternoon's status meeting.
The previous comment was as close as we have to a "recommendation" as to how to proceed (referencing comments from this afternoon's status meeting). Subsequent discussions (particularly comments from William) have us leaning in the direction of leaving what's in place now intact (so, continuing to allow the "Short" value, and cloning those elements with the new type name ("Browser") rather than modifying the existing AltTitle elements in place), which would allow us to make the schema changes and populate the new title elements in advance, anticipating what will be exported when the Drupal end of things is ready, instead of needing to coordinate the timing of these changes carefully with the web site. As I mentioned in the meeting, ~juther, it sounds like the Digital Platform team will be ready to get the ball rolling on their end when we have given them clear indications that our own armies are on the move.
Hi ~bkline , I think we can use the following names for the AltTitle values: Browser and CancerTypeHomePage. (I prefer to spell it out for clarity, even though it's long.) As I mentioned CTHPs are gradually being replaced, but that's a slow process and we can always revise the name of this attribute if/when we need to. 🙂
As for the globals, would it be possible to provide a spreadsheet with existing data for the AltTitle fields with Short and Navlabel attributes? While intuitively it seems to make sense to replace Short with Browser, I'd like to get a better sense of how the fields are currently populated before I say what should map to what. Thanks!
Spreadsheet attached. Shows all the titles for all the summaries.
The new AltTitleType valid values have been installed on DEV.
Any guidance on this, ~juther? I've got my weekly Digital Platform status meeting in an hour, and I'd like to know what progress I can give them. Thanks! 😉
Hi ~bkline . I've reviewed the spreadsheet and I think we can just rename the title labeled as "Short" with "Browser". For the few summaries that have an AltTitle without an attribute, we can leave them as is (if allowable by the schema) or move them to the new "Browser" title type.
It doesn't really matter what we do with the blocked summaries. For simplicity's sake, maybe we just migrate those titles over as well? Or we could just leave them be.
For this ticket we will:
copy the value in the "Short" alt titles into the "Browser" and "CTHP" alt titles
remove the title with "Short" as the title type
remove the rule requiring exactly one Short title
add rules requiring exactly one Browser title and at most one CTHP title
leave the short title type as allowable so we can bring up older versions without hassles
do nothing at all with blocked documents
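The document transformation in the first two steps could look roughly like the sketch below. It is not the actual global change script: it uses the standard library's ElementTree, assumes the AltTitle elements are direct children of the document's root element, assumes plain text content, and uses the spelled-out "CancerTypeHomePage" value agreed on earlier in this ticket.

```python
import xml.etree.ElementTree as ET

NEW_TYPES = ("Browser", "CancerTypeHomePage")


def convert_alt_titles(xml_text):
    """Replace the Short AltTitle with Browser and CancerTypeHomePage copies."""
    root = ET.fromstring(xml_text)
    for index, child in enumerate(list(root)):
        if child.tag == "AltTitle" and child.get("TitleType") == "Short":
            text = (child.text or "").strip()
            root.remove(child)
            # Insert the replacements where the Short title used to be.
            for offset, new_type in enumerate(NEW_TYPES):
                new = ET.Element("AltTitle", TitleType=new_type)
                new.text = text
                new.tail = child.tail
                root.insert(index + offset, new)
            break  # the schema allowed exactly one Short title
    return ET.tostring(root, encoding="unicode")
```

A document with no Short title passes through unchanged, so running the function a second time on an already converted document is harmless.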
Should we create a ticket for the global, or is there already a ticket?
I'm doing it as part of this ticket.
Though a separate ticket would probably have been better. Next time. 😛
I created the global change job and started it in test mode on DEV. When I came back to check on the status of the job after dinner I found that the job had been killed, a victim of a recent change in the configuration of the servers. We used to be able to keep a login session alive on a CDR server as long as we didn't leave an idle RDP session connected to it. That was broken by a change CBIIT made on the 3rd of this month. I have put in a ticket to have CBIIT restore the access I used to have to the databases from my workstation so that I can execute long-running jobs directly from my laptop without having them killed. We'll have to write the global change scripts very carefully, so that if the VPN connection dies in the middle of a live job we don't leave corrupted data behind, and we can detect which documents have already been processed and skip them if we have to resume a job in such a situation. 🙁
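One way to make a job like this safe to resume is to decide from the document itself whether it still needs work. The following is only a sketch of that idea with hypothetical names and a simplified input (a mapping of document IDs to the TitleType values each document currently has); it is not the CDR global change harness.

```python
def needs_conversion(title_types):
    """True if a document still has a Short title and no Browser title yet."""
    return "Short" in title_types and "Browser" not in title_types


def run_job(documents, convert):
    """Apply convert() to each document which still needs it.

    documents: mapping of doc ID -> list of TitleType values (hypothetical).
    Returns (processed IDs, skipped IDs) so that a resumed job can report
    what it left alone.
    """
    processed, skipped = [], []
    for doc_id in sorted(documents):
        if needs_conversion(documents[doc_id]):
            convert(doc_id)
            processed.append(doc_id)
        else:
            skipped.append(doc_id)
    return processed, skipped
```

Because the check depends only on the saved state of each document, restarting after a dropped connection would skip whatever had already been converted.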
Sure. I wanted to mention that it would be good to run the global in the following groups:
Language
Then by Audience
Then by Summary Type (not required - you can skip this one if it will get complicated)
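The requested grouping amounts to sorting on a three-part key and then batching on it. Below is a minimal sketch; the dictionary keys and the shape of the input records are assumptions made for illustration.

```python
from itertools import groupby


def batches(summaries):
    """Yield ((language, audience, type), [doc IDs]) in sorted group order."""
    def key(summary):
        return summary["language"], summary["audience"], summary["type"]

    for group_key, members in groupby(sorted(summaries, key=key), key=key):
        yield group_key, [summary["id"] for summary in members]
```

Dropping the summary type from the key would give the simpler two-level grouping mentioned as acceptable.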
OK. Is there any reason you couldn't have told me this before I wrote the script? 😛
You were too fast for me 🙂. I mentioned it in the CDR meeting when we discussed the global.
Test results on DEV:
Please run the global in live mode on DEV. Thanks!
Done.
We have verified the changes on DEV. However, we are concerned that every summary touched by the global will have a second Alt Title element (with attribute value of CTHP) even if the title is not used on any card on Cancer.gov, and even if currently, there is only one Alt Title element. The preference would have been to only include the CTHP title when it is used on a card on Cancer.gov so that the data in the CDR will match what is on Cancer.gov. When I first raised this issue there was a suggestion to modify the filters to be able to use the Browser title for the CTHP cards if we don't have a CTHP Alt Title in a document. Having the data match what is on Cancer.gov would help inform users when making changes to the Alt Titles. Other than that, we are ready for a test run on QA.
As I got deeper into the work needed to make the Drupal-side modifications, it became clear that it's not sufficient to simply copy the short title to the browser title. We will need to come up with browser titles which satisfy the requirement that the browser titles be unique, and which are short enough that they will satisfy the 100-character limit imposed on that field by Drupal. Otherwise, we will run into the same issue which caused the web site to switch to using the node title for the browser title instead of the value in the browser title field. Tagging ~buracklb for awareness.
One approach which might go some way toward achieving uniqueness without exceeding the length limit would be to append "(Patient)" (or the Spanish equivalent) instead of " (PDQ®)–Patient Version" to the patient summary titles and not append anything to the HP summary titles. This won't be enough for all of the titles, but it will handle most of them, and we can come up with manually created titles for the handful which would otherwise still be too long. Other suggestions?
~buracklb and ~juther I have attached a spreadsheet illustrating an approach to achieving the goal of ensuring uniqueness across the browser titles used for the cancer information summaries sent to Drupal. As noted in earlier comments, we have two constraints in tension with each other:
the titles need to be unique
the titles cannot exceed the length of 100 characters
Uniqueness can be achieved by taking the existing values formerly stored in the "Short" alternate title (still stored there in production) and appending qualifiers to at least some of those values to distinguish them from others without the appended qualifier. So, for example, if we have two English summaries, each with the same "short" title, one for patients, and the other for health professionals, we could add " (patient)" to the patient summary and the titles for the two summaries would be distinct from each other.
The more information we include in the appended qualifiers, the more titles we will push over the length limit, requiring us to manually construct a shorter version of the browser title. The approach represented by the spreadsheet appends " (patient)" or " (paciente)" to the patient summary titles, with the result that none of the resulting titles exceed the length limit.
This is just an example of how we can get to the goal, and I am not advocating that we necessarily use this exact approach. My intention is to get the discussion started for how to produce the titles we want. I realize that this particular approach does much less appending to the titles than is done for the main node titles for the summaries. Many of those titles, however, are too long to be stored in the browser title field. I am hoping that ~buracklb can help guide us with information on how the different choices we can make for achieving unique browser titles affect issues which are important (SEO, usability, accessibility, etc.).
Note that while the resulting titles are all short enough, some of the titles are still not unique. We will need to figure out why there are duplicates for the same title, and how we can write custom validation rules which ensure we do not publish duplicate titles, but do not get in the way of normal work to maintain the summaries.
The rows with MISSING or FAILURE in the title can be ignored for this purpose. Those are the ones which don't have any "Short" alternate title at all (or, in the case of the title with "FAILURE," no XML at all).
All of the summaries in the spreadsheet have an "Active" status in the production CDR.
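The analysis behind the spreadsheet can be sketched as below. This is an illustration rather than the script which produced the attachment: the row layout and the audience and language strings are assumptions, while the suffixes and the 100-character limit come from the comment above.

```python
from collections import defaultdict

MAX_BROWSER_TITLE = 100
PATIENT_SUFFIX = {"English": " (patient)", "Spanish": " (paciente)"}


def browser_title(short_title, language, audience):
    """Append the patient qualifier; leave HP titles as they are."""
    if audience == "Patients":
        return short_title + PATIENT_SUFFIX[language]
    return short_title


def audit(rows):
    """rows: iterable of (doc_id, short_title, language, audience).

    Returns (IDs whose title is too long, {title: [IDs]} for duplicates).
    """
    too_long, by_title = [], defaultdict(list)
    for doc_id, short_title, language, audience in rows:
        title = browser_title(short_title, language, audience)
        if len(title) > MAX_BROWSER_TITLE:
            too_long.append(doc_id)
        by_title[title].append(doc_id)
    duplicates = {t: ids for t, ids in by_title.items() if len(ids) > 1}
    return too_long, duplicates
```

In this sketch an English and a Spanish HP summary sharing a language-independent short title would still come out as duplicates, which is the kind of residue described above.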
Following up on our discussion in yesterday's weekly status meeting: we decided that we will distinguish patient summaries from health professional summaries by appending " (PDQ®)" to the HP titles as they are copied from the short title to the browser title during the global change for this ticket. As promised, I have analyzed the duplicate titles which appear on the spreadsheet I posted yesterday to determine the reason(s) for those duplicates. Here's what I found.
in some cases, one of the summaries is marked as a partner merge set
in some cases, one of the summaries is marked as a future replacement for the other
in some cases, both summaries are marked as only usable as modules
in some cases (for example, Cancell/Cantrol/Protocel) both languages have the same title
in a couple of cases an English summary has a Spanish title (CDR810760, CDR811723)
in one pair, one of the summaries (CDR763238) has "*TEMP*" in the main summary title
For most of these cases, at most only one of the summaries in a pair will actually be published with the browser title in question. In the handful of cases where the English and Spanish summaries have the same language independent title (Cancell/Cantrol/Protocel, PC-SPES, Angiosarcoma, 714-X), we can add " (español)" to the browser title for the Spanish summary.
The conditions which can explain pairs of duplicate titles make it impossible for the current validation subsystem to detect which duplicate titles should be treated as validation errors and which are benign. To achieve this capability we would need to create an elaborate extension to that subsystem, introducing significant additional complexity as well as a possibly noticeable hit in performance at document save time. So my next question for you, ~buracklb, is whether the uniqueness of the browser titles is a desirable condition as opposed to an inflexible block to publication. If the former, what I would propose is a nightly report which is sent to a distribution list showing pairs of summaries which have been published with the same browser title, similar to the nightly job which reports duplicate glossary term names repeatedly until the duplicates are resolved.
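For reference, the qualifier rules settled on in this comment could be expressed roughly as follows. The names are hypothetical, the audience and language strings are assumed, and the order of the two qualifiers when both apply is a guess on which nothing above depends.

```python
# Short titles named above as identical in English and Spanish.
LANGUAGE_INDEPENDENT = {"Cancell/Cantrol/Protocel", "PC-SPES", "Angiosarcoma", "714-X"}


def final_browser_title(short_title, language, audience):
    """Build a browser title from the short title using the agreed qualifiers."""
    title = short_title
    if audience == "Health professionals":
        title += " (PDQ®)"
    if language == "Spanish" and short_title in LANGUAGE_INDEPENDENT:
        title += " (español)"  # ordering relative to " (PDQ®)" is an assumption
    return title
```

Patient titles are left unqualified here, so an HP and a patient summary with the same short title end up with different browser titles.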
~bkline – I understand the duplicate title scenario we discussed yesterday, where both English and Spanish have the same title, and that is only a very small handful of summaries. I'm not a CDR product owner, but scenarios such as the last two bullets (an English summary has a Spanish title; one of the summaries has TEMP in the main summary title) seem more like errors that should be deleted than issues we need to worry about. Summaries that are marked as a future replacement should have distinct titles until they are published and the old version is deleted. This should all be verified with a CDR product owner or power user though, especially as I'm not familiar with scenarios/usage for the first and third bullets.
If my gut is correct though, we're only left to deal with the few summaries with duplicate titles in both English and Spanish.
Easiest resolution is to add "(español)" as you've suggested, ~bkline.
From an SEO standpoint, we want browser titles to be as descriptive and unique as possible. They should give users and search engines a clear idea of what the page content will be about. While these particular topics may not have high search volumes, we can always go the route of more descriptive browser titles. Here are some options/examples:
714-X and Cancer Care
714-X and Cancer Treatment
714-X Alternative Cancer Treatment
714-X Cancer Treatment Review
714-X - Complementary and Alternative Medicine
714-X Lacks Study Support for Cancer Treatment
Lastly, I would not advocate for any elaborate build as you've described. I believe we only have a few duplicate titles to resolve and after that, this is something content authors should be aware of and resolve before implementation of content into the CDR.
~bkline - would you be able to (mostly) reuse duplicate glossary term name report for the duplicate browser report? Or would a duplicate browser report require additional effort? Once the summaries are in Drupal, we can easily request a report to identify any duplicate titles. This wouldn't be automated, but it's a simple request. Again, I believe all content creators/authors should be aware of the duplicate title issue and identify and resolve such issues before the content is entered in the CDR. Because of that, I don't think extra development effort is necessary. But CDR owners/users should have the final say over me.
Thanks Bob! Please see my comments about the data issues below:
in some cases, one of the summaries is marked as a partner merge set - I assume this is OK since the partner summary won't be published to Cancer.gov.
in some cases, one of the summaries is marked as a future replacement for the other - This is OK as the issue will be resolved when the replacement is completed.
in some cases, both summaries are marked as only usable as modules - This should be OK as long as they are marked as Module Only. However, we will review them.
in some cases (for example, Cancell/Cantrol/Protocel) both languages have the same title
in a couple of cases an English summary has a Spanish title (CDR810760, CDR811723) - One of these is a duplicate and other was still in the process and not published yet. It will be corrected before publishing.
in one pair, one of the summaries (CDR763238) has "*TEMP*" in the main summary title - This is OK. They will be resolved eventually like the replacement one above.
I have transformed the summary documents again on CDR DEV, using the approach we settled on for making the browser titles unique, and I set up an ODE and populated it with these summaries. Please review the documents in the CDR and on the ODE. In order to do test runs on QA we would need to install the software and schema changes on that tier. Would that disrupt any testing of changes for other tickets? ~volker?
Not if you stop finding bugs in the filter code. 🙂
I have filter changes on QA for the Special Considerations that ~oseipokuw is looking at but that can easily be restored if needed.
~volker I have a browser-title branch in the cdr-server repository (for the summary schema changes), in the cdr-lib repository (for the cdrpub.py changes), and in the cdr-tools repository (to update the tool which creates sample YAML content for Drupal). Are there any filter changes for this ticket which need to get checked into this branch in the cdr-server repository? Or any other changes you've made for this ticket?
I didn't make any changes for this ticket and if you didn't have any filter changes there shouldn't be any overlap.
I have installed the schema changes on CDR QA and I'm running the revised global change job in test mode on that tier. As I'm monitoring the job I can see that there are some "Short" AltTitles which contain markup, including some with Insertion/Deletion markup going back years. Here's one of the more recent examples (reformatted for readability):
<AltTitle xmlns:cdr="cips.nci.nih.gov/cdr" TitleType="Short">
  <GeneName>
    <Comment user="isaacsjs" audience="Internal" date="2022-05-02">
      This module is not linked to Gen of Skin anymore - JI
    </Comment>
    PTEN
  </GeneName>
  hamartoma tumor syndromes (including Cowden syndrome)-Intro
</AltTitle>
It would be prohibitively expensive (and somewhat risky) to try to come up with logic which could reliably do the right thing with every possible combination of inline markup, so we have two feasible approaches that I can think of.
Eliminate the markup in advance of running the global change.
Have the global change skip documents with this problem, and fix them by hand.
Just as a reminder, in case this might affect your decision about which option to use, what we publish to the web site has all of that markup stripped out. I considered a third option, which was to have the global change software strip the markup, but when I realized that some of the markup was revision markup, it became clear that we would end up with garbled (or even empty) text content for the titles. I suppose we could adopt this approach of stripping the markup after having resolved the revision markup, but that would assume we know which revision markup should be applied and which backed out (presumably the old revision markup is still there—for example in CDR62936—because a decision hasn't yet been made which way to go).
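A sketch of how the second option (skip and fix by hand) might detect the affected titles, along with a demonstration of why naive stripping is unsafe when revision markup is present. The helper names and the sample title are made up for illustration; this is not the global change code.

```python
import xml.etree.ElementTree as ET


def has_inline_markup(alt_title_xml):
    """True if the serialized AltTitle element has any child elements."""
    return len(ET.fromstring(alt_title_xml)) > 0


def naive_strip(alt_title_xml):
    """Concatenate every text node, ignoring what the markup means."""
    return "".join(ET.fromstring(alt_title_xml).itertext())


def partition(titles):
    """titles: mapping of CDR ID -> serialized AltTitle (hypothetical input).

    Returns (IDs safe to convert, IDs to skip and fix by hand).
    """
    safe, by_hand = [], []
    for cdr_id, xml_text in titles.items():
        (by_hand if has_inline_markup(xml_text) else safe).append(cdr_id)
    return safe, by_hand
```

Given a title containing both a Deletion and an Insertion element, `naive_strip` keeps the text of both, which is how a value like "Childhood Breast TumorsCancer Treatment" can arise.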
Looks like there are 13 such documents on QA. The job just finished.
https://cdr-qa.cancer.gov/cgi-bin/cdr/ShowGlobalChangeTestResults.py?dir=2023-03-01_08-01-11
I have attached a text file showing that markup.
Capturing the decision made in Thursday's status meeting: the next step is that the inline markup will be removed from the Short titles on QA, right?
It looks like both the browser title and CTHP title are required in the schema. I thought only the browser title was supposed to be required.
Drupal requires both values.
Markup removed from the affected 13 summaries on QA. Is the schema change on STAGE ?
Not yet. I think it will be easier to edit the Short titles before we make that change, right?
Some of the diffs appear not to include the text within the AltTitle tags for the CancerTypeHomePage, although the New XML appears to be fine and include the data.
Example: CDR0000790949
- <AltTitle TitleType="Short">
+ <AltTitle TitleType="Browser">Childhood Breast TumorsCancer Treatment</AltTitle>
+ <AltTitle TitleType="CancerTypeHomePage">
That's because the diff software tries to normalize the serialized XML by putting each element on a separate line, as part of the attempt to provide the shorter lines you have requested. In this case, the "before" version for the "Short" title had the same inline markup which the "after" version had in the "CancerTypeHomePage" title, so only the lines with the opening tag changed. The content and closing tags (moved to separate lines) were the same, so they don't show up on the "diff" page. This won't happen for the diff reports when the unwanted inline markup has been removed from the documents being transformed.
Or to put it more succinctly, reduced context is a price paid for shorter diff lines.
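The effect can be reproduced with the standard library's difflib. This is only a sketch of the general technique, using a crude regular-expression normalizer of my own invention; it is not the code behind the CDR diff report.

```python
import difflib
import re


def normalize(xml_text):
    """Put every tag and every run of text on its own line."""
    broken = re.sub(r">\s*", ">\n", re.sub(r"\s*<", "\n<", xml_text))
    return [line for line in broken.splitlines() if line.strip()]


def changed_lines(before, after):
    """Return only the added and removed lines, with no context."""
    diff = difflib.unified_diff(normalize(before), normalize(after),
                                lineterm="", n=0)
    return [line for line in diff
            if line.startswith(("+", "-"))
            and not line.startswith(("+++ ", "--- "))]
```

When only the TitleType attribute differs, the opening tag lines are the only ones reported; the unchanged content and closing tag lines drop out of the diff.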
Please run in live mode on QA. Thanks!
It's actually running in test mode on QA right now, since we haven't done that since the titles got fixed.
As soon as you have reviewed the test results I'll run the global change in live mode on QA and we can move on to the next steps of pushing everything to STAGE.
Test results look good. Please run in live mode on QA. Thanks!
Global change has run in live mode on QA. Please review the results.
By the way—when you edit the existing Short titles to eliminate inline markup, don't add the new Browser or CancerTypeHomePage titles. Otherwise you'll end up with extra title elements after the global change has run.
~bkline , if you are referring to a single document then that was likely me who did that. I thought the documents on QA had already been converted when I created a test summary to work on the Bold/Underline copy/paste issue.
I believe some of them may have been added when we were getting the Special Consideration documents ready for publishing. We wouldn't have been able to create publishable versions without adding the new alt titles. We will have to go back and review those summaries now that the global has completed.
~oseipokuw Just a reminder that I'm holding off on the preparations of QA for user acceptance testing of Pauling (which is scheduled to start tomorrow) until you've finished with your review of the live global change on that server.
Yes, Understood! However, we are starting to test on DEV while we complete the browser title testing ASAP. We will also want a refresh of QA so I will create a ticket for that.
I assume you will need to publish the summaries to the ODE before we can review them on there as pub preview appears to show the old display.
I assumed you wouldn't want me to do that until you have reviewed the documents in the CDR.
They look good on QA. Please proceed to publish them to the ODE.
~volker The DTD changes for this ticket don't seem to have made it into GitHub. Please create a browser-title branch in the cdr-publishing repository and check in those changes. I have temporarily made the necessary edits on QA in the file system so that I can publish from there.
I assume you're going to need to modify both DTDs, since the data partners will be getting the new attribute values, too, right?
The push job failed when it tried to push 62975. That's because the document had no browser title. Looking back at the global change job's logs, I see it couldn't save a new published version because of this error:
/GlossaryTermName/TermNameStatus != "Rejected" (2 times) Failed link target rule
Looks like that error was also logged during the test run. (Always a good idea to investigate logged errors in the test results.)
So we have two paths we can take:
fix the broken documents
rewrite the publishing software to work around the broken documents
I have fixed the problem and created a publishable version of 62975. I had to do the same for CDR0000410719 which also contained a rejected term. So, it is likely that there are more of these. I investigated and fixed several of these before the live run.
Generally, we don't fix problems like these on QA unless they will block the live run of the global change, because that would be double work. We would rather fix them on PROD if they exist on PROD. In this case, we are also running a publishing job, so I understand why this is an issue on QA now.
Is there a reason why the error is displayed only in the LASTP row? It does not appear to be an issue in the CWD and LASTV which we look at more carefully. Most of the errors are displayed in all the different versions/rows.
Just to be clear: for the purpose of this exercise, it's only the documents which have errors preventing the creation of a new publishing version which would need to be fixed.
Sure. If there are more, let me know and I will fix them.
Did you look at the version history report? When Chanita fixed the glossary term links she didn't create a publishable version for some reason.
Did you fix all of the documents which had errors in the LASTP row of the test results for the global change?
I don't see a ticket for DTD changes, that's why "those changes" don't exist yet.
As for the partner documents, I'm pretty sure we don't want to send the partners the "CancerTypeHomePage" AltTitle and we may want to continue giving them a "Short" title. I'll have to look first what we're currently sending. It wasn't on my radar but I hear it beeping now! 🙂
I suppose that explains why only the LASTP will show the error. I assume it is OK now to publish to the ODE.
$ ls CDR*.pub*ERROR*
CDR0000062808.pub.NEW_ERRORS.txt CDR0000787346.pub.NEW_ERRORS.txt
CDR0000062841.pub.NEW_ERRORS.txt CDR0000799416.pub.NEW_ERRORS.txt
CDR0000062872.pub.NEW_ERRORS.txt CDR0000799716.pub.NEW_ERRORS.txt
CDR0000062975.pub.NEW_ERRORS.txt CDR0000799767.pub.NEW_ERRORS.txt
CDR0000062978.pub.NEW_ERRORS.txt CDR0000805475.pub.NEW_ERRORS.txt
CDR0000410719.pub.NEW_ERRORS.txt CDR0000805686.pub.NEW_ERRORS.txt
CDR0000446177.pub.NEW_ERRORS.txt CDR0000809230.pub.NEW_ERRORS.txt
CDR0000517309.pub.NEW_ERRORS.txt CDR0000809329.pub.NEW_ERRORS.txt
CDR0000700000.pub.NEW_ERRORS.txt CDR0000810015.pub.NEW_ERRORS.txt
CDR0000752413.pub.NEW_ERRORS.txt CDR0000810042.pub.NEW_ERRORS.txt
CDR0000774255.pub.NEW_ERRORS.txt
I assume it is OK now to publish to the ODE.
I tried (again). It failed (again).
It looks like JIRA is eating more comments (I posted a reply last night but it's gone). I will assume that this is why you didn't see my earlier question:
Did you fix all of the documents which had errors in the LASTP row of the test results for the global change?
To save you from having to scroll through the test results report I have listed all of those documents in my previous comment. I do not want to have you fix one document at a time, ask me to try publishing again, the publishing job fails again, then you fix one more document and ask me to try again, on and on.
Once you have fixed ALL the documents I will try publishing to the ODE again.
As for JIRA, my working theory is that trying to keep up with threading of replies may be causing (or at least contributing to) its failures to display all of the comments, so I'm going to avoid using the Reply feature (at least for a while) to see if that improves JIRA's behavior. I will instead create standalone comments, using quoting of relevant passages from earlier comments to provide any necessary context.
Please try again. I believe I fixed all the errors, created new publishable versions, and double-checked by running pub preview for each of them. Some of the errors did not show up during validation checks, and those documents could even be saved as publishable versions, but pub preview failed for them until the errors were fixed.
The summaries have been pushed to the ODE.
What is a good way to get to the list of summaries on the ODE? Searches provide results that point to cancer.gov. It looks like you need to know the URL of the summary and replace the cancer.gov name with the ODE URL before you get to a summary.
What is a good way to get to the list of summaries on the ODE?
I have created the query Summaries on Drupal at https://cdr-qa.cancer.gov/cgi-bin/cdr/CdrQueries.py.
I modified the query to add a URL column.
~bkline, I modified the DTD for Cancer.gov (pdqCG.dtd) and pushed it in the branch 'browser-title'. I also modified the filter CDR0000609947 to recreate the Short title for the PDQ partners. I also created a branch "browser-title" for this change but later learned there's already a branch with that name in the repository and now GH won't let me push the change. I will figure out how to get GH to cooperate tomorrow, probably by removing my branch and pulling yours.
The updated filter is currently on DEV.
there's already a branch with that name in the repository
This is truly odd. I have no trace of such a branch in any of my clones of the cdr-publishing repository, nor did I see one when I looked on GitHub the other day. I see it on GitHub now, however, with your commit from last night. Furthermore, when I go to https://github.com/NCIOCPL/cdr-publishing/branches/yours I only see Pauling, so it would appear that GitHub is under the impression that you (or at least someone other than ~bkline) created that branch. Do you have more than one clone of this repository? Is it possible you created the branch in one and it confused the other? I have three clones: one on a network share which I can use from any machine on the NCI network, one on the government's laptop, and one on my own MacBook. None of them has that branch. I will create a fourth (temporary) clone so I can see if it gives me any clues about the history of the repositories.
Ah, never mind. You're not talking about the cdr-publishing repository, you're talking about the cdr-server repository, where I had indeed created the browser-title branch to store the schema changes.
I recommend using
git diff HEAD~ -p > /somewhere/browser-title.patch
Then:
remove the local branch
pull down the branch from GitHub
apply the patch
commit
push
"Apply the patch" would be
patch -p1 < /somewhere/browser-title.patch
We've reviewed a good sample of the pages on the ODE and did not find anything odd. They all looked good. So, I think we can proceed to move this to STAGE. Thanks!
Here's the report for the test run of the global change job on STAGE. Better than on the lower tiers, but still a couple of summaries whose latest publishing version failed to get updated.
https://cdr-dev.cancer.gov/cgi-bin/cdr/ShowGlobalChangeTestResults.py?dir=2023-03-31_18-22-16
Ah, I just noticed that the job failed when the database went away partway through. So I'm running it again. That might account for the fact that there were fewer last published versions with problems.
The job ran for about an hour, then failed again. Trying again for a third time.
Third time worked. https://cdr-dev.cancer.gov/cgi-bin/cdr/ShowGlobalChangeTestResults.py?dir=2023-04-03_13-03-13. As you can see, there are more LASTP errors than the two I saw for the first (failed) run.
I think I fixed all the LASTP errors on STAGE. Would you want to run in test mode again just to be sure?
https://cdr-qa.cancer.gov/cgi-bin/cdr/ShowGlobalChangeTestResults.py?dir=2023-04-07_07-44-22
I ran it from QA this time to see if that would avoid the failures I got running from DEV, and it succeeded the first time. Just under 3 hours (I misspoke a day or two ago when I said it took 2 hours on DEV: it took about the same amount of time on DEV as it took today on QA). I'll run from the QA server when we get to the production rollout.
Please check these results.
Again, the data involved is on STAGE, even though the job was launched on QA.
I have fixed the two LASTP errors in this latest run on STAGE. Please run in test mode again.
Latest test run against STAGE:
https://cdr-qa.cancer.gov/cgi-bin/cdr/ShowGlobalChangeTestResults.py?dir=2023-04-07_12-55-11
It looks like there are no more LASTP errors, so you may proceed with the live run on STAGE.
Live run complete on STAGE. Logs attached. Let me know when you've checked the results and are ready for me to publish to the ODE.
I reviewed some of the documents on STAGE. They looked good. Thanks!
I have loaded the summaries on STAGE to the ODE.
https://ncigovcdode539.prod.acquia-sites.com/
Giving you a chance to check these over, ~oseipokuw, before I turn it over to ~buracklb & co.
Will the query "Summaries on Drupal" retrieve the right information from the ODE for STAGE? I ran the one on QA (as a test) but it is no longer giving me the ODE URLs. Rather I am getting URLs pointing to the live site.
Sure. Just copy the query to STAGE and uncomment (remove the "--" delimiters) the three lines for the REPLACE calls (you can comment out the following line, which gives you the original URL).
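The query itself is not shown in this ticket, but the effect of the REPLACE calls (swapping the live site's host for the ODE host while keeping the summary's path) can be sketched as follows. The ODE host is the one posted earlier in this ticket; the function name `to_ode_url` and the sample path are hypothetical and only for illustration.

```python
from urllib.parse import urlsplit, urlunsplit

# ODE host as posted in this ticket; it would differ for another ODE.
ODE_HOST = "ncigovcdode539.prod.acquia-sites.com"


def to_ode_url(live_url, ode_host=ODE_HOST):
    """Return live_url with its host swapped for the ODE host.

    Scheme, path, query string, and fragment are left untouched.
    """
    parts = urlsplit(live_url)
    return urlunsplit(parts._replace(netloc=ode_host))


if __name__ == "__main__":
    # Hypothetical summary URL on the live site.
    print(to_ode_url("https://www.cancer.gov/types/example/hp/example-pdq"))
```

Working on the parsed URL rather than on the raw string avoids accidentally rewriting a host name that happens to appear inside a path or query string.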
Worked. Thanks!
I have reviewed several summaries on the ODE, and they all looked good. Thanks!
I ran a weekly publishing job on STAGE and inspected the output to confirm that, in the partner output, the new TitleType values (Browser and CTHP) for the AltTitle element have been replaced by a single AltTitle with "TitleType=Short".
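For readers unfamiliar with the partner output, the transformation being confirmed can be illustrated roughly as below. This is not the actual filter (CDR0000609947 is an XSLT filter whose contents are not shown here). The sketch assumes the attribute values are spelled "Browser" and "CancerTypeHomePage", and it assumes the CTHP title is the one kept as the partners' Short title; the ticket does not say which of the two is used, so treat that choice as a placeholder.

```python
import xml.etree.ElementTree as ET


def collapse_alt_titles(root, keep="CancerTypeHomePage", drop=("Browser",)):
    """Rewrite AltTitle elements in place for partner output.

    The AltTitle whose TitleType equals `keep` is relabeled "Short";
    AltTitle elements whose TitleType is in `drop` are removed.
    Which title to keep is an assumption, not taken from the real filter.
    """
    # Snapshot the elements first so removals do not disturb the traversal.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag != "AltTitle":
                continue
            title_type = child.get("TitleType")
            if title_type == keep:
                child.set("TitleType", "Short")
            elif title_type in drop:
                parent.remove(child)
    return root


if __name__ == "__main__":
    # Hypothetical, heavily simplified summary document.
    doc = ET.fromstring(
        "<Summary>"
        "<SummaryTitle>Example Cancer Treatment (PDQ)</SummaryTitle>"
        '<AltTitle TitleType="Browser">Example Cancer Treatment (HP)</AltTitle>'
        '<AltTitle TitleType="CancerTypeHomePage">Example Cancer Treatment</AltTitle>'
        "</Summary>"
    )
    collapse_alt_titles(doc)
    print(ET.tostring(doc, encoding="unicode"))
```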
Deployed to production.
We published new summaries this morning and I was able to confirm that the right Browser Titles display on Cancer.gov. Thanks!
File Name | Posted | User |
---|---|---|
alt-titles-with-markup.log | 2023-03-01 09:43:26 | Kline, Bob (NIH/NCI) [C] |
browser-title-errors.png | 2023-03-21 11:34:36 | Kline, Bob (NIH/NCI) [C] |
BrowserTitlePP_QA.PNG | 2023-03-20 10:39:58 | Osei-Poku, William (NIH/NCI) [C] |
browser-titles.xlsx | 2023-02-23 12:51:21 | Kline, Bob (NIH/NCI) [C] |
cthp error.PNG | 2023-03-09 17:31:08 | Osei-Poku, William (NIH/NCI) [C] |
fixed-without-creating-publishable-version.png | 2023-03-21 13:29:02 | Kline, Bob (NIH/NCI) [C] |
image-2023-04-10-17-26-13-744.png | 2023-04-10 17:26:14 | Osei-Poku, William (NIH/NCI) [C] |
ocecdr-5104-stage-live.log | 2023-04-08 01:45:24 | Kline, Bob (NIH/NCI) [C] |
summary-alt-titles.xlsx | 2022-11-11 17:56:24 | Kline, Bob (NIH/NCI) [C] |