CDR Tickets

Issue Number 3825
Summary Annual NLM DTD changes that affect citations import
Created 2014-11-05 17:44:04
Issue Type New Feature
Submitted By Osei-Poku, William (NIH/NCI) [C]
Assigned To Kline, Bob (NIH/NCI) [C]
Status Closed
Resolved 2014-11-06 09:28:20
Resolution Fixed
Path /home/bkline/backups/jira/ocecdr/issue.141236
Description

I have posted below the email exchanges we had regarding NLM's annual DTD changes.

______________________________
Volker Englisch
NCI OCPL – Office of Communications & Public Liaison
Contractor: Sapient Government Services
NCI: 240-276-6583

---Original Message---
From: Osei-Poku, William William.Osei-Poku@icfi.com
Sent: Tuesday, November 04, 2014 2:07 PM
To: Juthe, Robin (NIH/NCI) [E]; Englisch, Volker (NIH/NCI) [C]; Kline, Robert (NCI); Beckwith, Margaret (NIH/NCI) [E]; Alan Meyer; Henry, Erika (NIH/NCI) [C]
Subject: RE: [Utilities-announce] PubMed E-Utilities 2015 DTD updates

I agree that it should go in the hotfix. In the past we were able to import citations, but they remained invalid and we were unable to use them until our schema was updated.

Thanks,
William

---Original Message---
From: Juthe, Robin (NIH/NCI) [E] robin.juthe@nih.gov
Sent: Tuesday, November 04, 2014 1:30 PM
To: Englisch, Volker (NIH/NCI) [C]; Kline, Robert (NCI); Beckwith, Margaret (NIH/NCI) [E]; Osei-Poku, William; Alan Meyer; Henry, Erika (NIH/NCI) [C]
Subject: RE: [Utilities-announce] PubMed E-Utilities 2015 DTD updates

I think this should go into the hotfix if possible. In the meantime, we run the risk of importing duplicate citations or not being able to import citations at all, right?

Thanks,
Robin

---Original Message---
From: Englisch, Volker (NIH/NCI) [C]
Sent: Tuesday, November 04, 2014 1:27 PM
To: Kline, Robert (NCI); Beckwith, Margaret (NIH/NCI) [E]; Juthe, Robin (NIH/NCI) [E]; Osei-Poku, William; Alan Meyer; Henry, Erika (NIH/NCI) [C]
Subject: RE: [Utilities-announce] PubMed E-Utilities 2015 DTD updates

Do we need to hold up the Mailer hot-fix to include the citation schema hot-fix? I don't think we would be able to launch another hot-fix before December.

Thanks,

Volker
___________________________________
Volker Englisch
NCI OCPL – Office of Communications & Public Liaison
Contractor: Sapient Government Services
NCI: 240-276-6583

---Original Message---
From: Kline, Robert (NCI)
Sent: Friday, October 31, 2014 12:17 PM
To: Beckwith, Margaret (NIH/NCI) [E]; Juthe, Robin (NIH/NCI) [E]; Osei-Poku, William; Alan Meyer; Englisch, Volker (NIH/NCI) [C]; Henry, Erika (NIH/NCI) [C]
Subject: Fwd: [Utilities-announce] PubMed E-Utilities 2015 DTD updates

Just a heads-up that we may need to update our Citation schema in the CDR some time before December. Assuming no CDR release or patch is scheduled between now and then, this will require a ticket to CBIIT to get some of the mods installed on production.

Bob

---------- Forwarded message ----------
From: <utilities-announce@ncbi.nlm.nih.gov>
Date: Fri, Oct 31, 2014 at 11:50 AM
Subject: [Utilities-announce] PubMed E-Utilities 2015 DTD updates
To: NLM/NCBI List utilities-announce <utilities-announce@ncbi.nlm.nih.gov>

Dear NCBI PubMed E-Utilities Users,

We anticipate updating the PubMed E-Utilities DTDs for 2015 in mid-December, approximately on December 15, 2014.

The forthcoming DTDs are now available:

http://eutils.ncbi.nlm.nih.gov/entrez/query/DTD/bookdoc_150101.dtd

http://eutils.ncbi.nlm.nih.gov/entrez/query/DTD/nlmmedlinecitationset_150101.dtd

http://eutils.ncbi.nlm.nih.gov/entrez/query/DTD/pubmed_150101.dtd

The DTD changes for the 2015 production year are itemized in the Revision Notes section near the top of the DTDs. The following describes the substantive changes:

Add new UI attribute to DescriptorName, QualifierName, NameOfSubstance, SupplMeshName and PublicationType elements. The new required UI attribute will carry the MeSH unique identifier for DescriptorName, QualifierName, NameOfSubstance, SupplMeshName and PublicationType elements.

DTD:

<!ELEMENT DescriptorName (#PCDATA)>

<!ATTLIST DescriptorName

MajorTopicYN (Y | N) "N"

Type (Geographic) #IMPLIED

UI CDATA #REQUIRED>

<!ELEMENT QualifierName (#PCDATA)>

<!ATTLIST QualifierName

MajorTopicYN (Y | N) "N"

UI CDATA #REQUIRED>

<!ELEMENT NameOfSubstance (#PCDATA)>

<!ATTLIST NameOfSubstance

UI CDATA #REQUIRED>

<!ELEMENT SupplMeshName (#PCDATA)>

<!ATTLIST SupplMeshName

Type (Disease | Protocol) #REQUIRED

UI CDATA #REQUIRED>

<!ELEMENT PublicationTypeList (PublicationType+)>

<!ELEMENT PublicationType (#PCDATA)>

<!ATTLIST PublicationType

UI CDATA #REQUIRED>

Sample XML:

<DescriptorName MajorTopicYN="N" UI="D054971">Orthostatic Intolerance</DescriptorName>

<QualifierName MajorTopicYN="N" UI="Q000628">therapy</QualifierName>

<NameOfSubstance UI="C058787">royal jelly</NameOfSubstance>

<SupplMeshName Type="Disease" UI="C537735">Oculofaciocardiodental syndrome</SupplMeshName>

<PublicationType UI="D016428">Journal Article</PublicationType>
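
As an illustration of how a consumer of these records might read the new attribute, the following sketch uses Python's standard xml.etree.ElementTree module to collect the UI values from the sample elements above. The wrapper element Sample and the function name collect_mesh_uis are assumptions made for this example; they are not part of the NLM DTDs or of the CDR code.

```python
import xml.etree.ElementTree as ET

# Elements that gain the required UI attribute in the 2015 DTDs.
UI_ELEMENTS = ("DescriptorName", "QualifierName", "NameOfSubstance",
               "SupplMeshName", "PublicationType")

# Hypothetical wrapper element around the sample XML from the announcement.
SAMPLE = """<Sample>
<DescriptorName MajorTopicYN="N" UI="D054971">Orthostatic Intolerance</DescriptorName>
<QualifierName MajorTopicYN="N" UI="Q000628">therapy</QualifierName>
<NameOfSubstance UI="C058787">royal jelly</NameOfSubstance>
<SupplMeshName Type="Disease" UI="C537735">Oculofaciocardiodental syndrome</SupplMeshName>
<PublicationType UI="D016428">Journal Article</PublicationType>
</Sample>"""


def collect_mesh_uis(xml_text):
    """Return (element name, UI, text) tuples; UI is None when the attribute is absent."""
    root = ET.fromstring(xml_text)
    found = []
    for node in root.iter():
        if node.tag in UI_ELEMENTS:
            found.append((node.tag, node.get("UI"), (node.text or "").strip()))
    return found


if __name__ == "__main__":
    for name, ui, text in collect_mesh_uis(SAMPLE):
        print(name, ui, text)
```

Returning None for a missing UI lets the same function handle records fetched before the DTD change, which carry no such attribute.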

Add new optional and repeatable envelope element AffiliationInfo to Author and Investigator elements. The AffiliationInfo envelope element includes Affiliation and Identifier elements.

DTD:

<!ELEMENT Author (((LastName, ForeName?, Initials?, Suffix?) |
CollectiveName), Identifier*,

AffiliationInfo*)>

<!ATTLIST Author ValidYN (Y | N) "Y">

<!ELEMENT Investigator (LastName, ForeName?, Initials?, Suffix?,
Identifier*, AffiliationInfo*)>

<!ATTLIST Investigator ValidYN (Y | N) "Y">

<!ELEMENT AffiliationInfo (Affiliation, Identifier*)>

<!ELEMENT Affiliation (#PCDATA)>

<!ELEMENT Identifier (#PCDATA)>

<!ATTLIST Identifier

Source CDATA #REQUIRED>

Sample XML:

<AuthorList CompleteYN="Y">

<Author ValidYN="Y">

<LastName>Rome</LastName>

<ForeName>Benjamin N</ForeName>

<Initials>BN</Initials>

<Identifier Source="ORCID">0000000111111111</Identifier>

<AffiliationInfo>

<Affiliation>Harvard Medical School, Boston, Massachusetts</Affiliation>

<Identifier Source="Ringgold">123456</Identifier>

</AffiliationInfo>

<AffiliationInfo>

<Affiliation>Program on Regulation, Therapeutics, and Law, Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Brigham and Women's Hospital, Boston, Massachusetts</Affiliation>

</AffiliationInfo>

<AffiliationInfo>

<Affiliation>Beth Israel Deaconess Medical</Affiliation>

<Identifier Source="Ringgold">678922</Identifier>

</AffiliationInfo>

</Author>

</AuthorList>

<InvestigatorList>

<Investigator ValidYN="Y">

<LastName>Salloway</LastName>

<ForeName>S</ForeName>

<Initials>S</Initials>

<AffiliationInfo>

<Affiliation>University of Missouri at Kansas City</Affiliation>

<Identifier Source="Ringgold">11223344</Identifier>

</AffiliationInfo>

<AffiliationInfo>

<Affiliation>Kansas City AIDS Research Consortium</Affiliation>

</AffiliationInfo>

<AffiliationInfo>

<Affiliation>AIDS Administration Missouri</Affiliation>

<Identifier Source="Ringgold">66778899</Identifier>

</AffiliationInfo>

</Investigator>

</InvestigatorList>
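
To make the nesting concrete, here is a minimal sketch (again using Python's standard xml.etree.ElementTree) that separates an author's own Identifier elements from those attached to each AffiliationInfo. The abbreviated SAMPLE string and the helper name affiliations are assumptions for illustration only and do not reflect how the CDR itself processes citations.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the AuthorList sample above.
SAMPLE = """<AuthorList CompleteYN="Y">
<Author ValidYN="Y">
<LastName>Rome</LastName>
<ForeName>Benjamin N</ForeName>
<Initials>BN</Initials>
<Identifier Source="ORCID">0000000111111111</Identifier>
<AffiliationInfo>
<Affiliation>Harvard Medical School, Boston, Massachusetts</Affiliation>
<Identifier Source="Ringgold">123456</Identifier>
</AffiliationInfo>
<AffiliationInfo>
<Affiliation>Beth Israel Deaconess Medical</Affiliation>
<Identifier Source="Ringgold">678922</Identifier>
</AffiliationInfo>
</Author>
</AuthorList>"""


def affiliations(person):
    """List (affiliation text, {source: identifier}) pairs for one Author or Investigator."""
    result = []
    for info in person.findall("AffiliationInfo"):
        name = info.findtext("Affiliation", default="")
        ids = {i.get("Source"): i.text for i in info.findall("Identifier")}
        result.append((name, ids))
    return result


if __name__ == "__main__":
    root = ET.fromstring(SAMPLE)
    for author in root.findall("Author"):
        # findall() matches direct children only, so these identifiers
        # belong to the person rather than to any one affiliation.
        personal = {i.get("Source"): i.text for i in author.findall("Identifier")}
        print(author.findtext("LastName"), personal)
        for name, ids in affiliations(author):
            print("  ", name, ids)
```

The same helper works for Investigator elements, since both content models end with Identifier* followed by AffiliationInfo*.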

The valid value UNLABELLED is removed from the AbstractText NlmCategory attribute because it is not used.

DTD:

<!ELEMENT AbstractText (#PCDATA)>

<!ATTLIST AbstractText

Label CDATA #IMPLIED

NlmCategory (BACKGROUND | OBJECTIVE | METHODS | RESULTS | CONCLUSIONS |

UNASSIGNED) #IMPLIED>
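
A small sketch of what this enumeration means for a validating consumer, written in Python. The names NLM_CATEGORIES_2015 and category_is_valid are hypothetical and exist only for this example.

```python
# Values permitted for AbstractText/@NlmCategory in the 2015 DTD.
# UNLABELLED is deliberately absent because the announcement removes it.
NLM_CATEGORIES_2015 = frozenset({
    "BACKGROUND", "OBJECTIVE", "METHODS", "RESULTS", "CONCLUSIONS", "UNASSIGNED",
})


def category_is_valid(value):
    """Accept a missing attribute (it is #IMPLIED) or any enumerated value."""
    return value is None or value in NLM_CATEGORIES_2015


if __name__ == "__main__":
    for candidate in ("METHODS", "UNLABELLED", None):
        print(candidate, category_is_valid(candidate))
```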

The valid values AssociatedDataset and AssociatedPublication are added to the CommentsCorrections RefType attribute.

DTD:

<!ELEMENT CommentsCorrectionsList (CommentsCorrections+)>

<!ELEMENT CommentsCorrections (RefSource, PMID?, Note?)>

<!ATTLIST CommentsCorrections

RefType (AssociatedDataset | AssociatedPublication | CommentOn | CommentIn | ErratumIn | ErratumFor |

PartialRetractionIn | PartialRetractionOf | RepublishedFrom | RepublishedIn | RetractionOf |

RetractionIn | UpdateIn | UpdateOf | SummaryForPatientsIn | OriginalReportIn | ReprintOf | ReprintIn | Cites) #REQUIRED>

Sample XML:

PMID 24872877

<CommentsCorrectionsList>

<CommentsCorrections RefType="AssociatedPublication">

<RefSource>Gigascience. 2014 May 28;3:8. doi: 10.1186/2047-217X-3-8.
eCollection 2014.</RefSource>

<PMID Version="1">24872878</PMID>

</CommentsCorrections>

</CommentsCorrectionsList>

PMID 24872878

<CommentsCorrectionsList>

<CommentsCorrections RefType="AssociatedDataset">

<RefSource>Gigascience. 2014 May 28;3:7. doi: 10.1186/2047-217X-3-7.
eCollection 2014.</RefSource>

<PMID Version="1">24872877</PMID>

</CommentsCorrections>

</CommentsCorrectionsList>
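
The sketch below, in Python with the standard xml.etree.ElementTree module, shows one way to pick out links that use the two new RefType values. The function name associated_links and the constant NEW_REF_TYPES are assumptions introduced for this example.

```python
import xml.etree.ElementTree as ET

# RefType values added to CommentsCorrections in the 2015 DTD.
NEW_REF_TYPES = {"AssociatedDataset", "AssociatedPublication"}

# Sample taken from the record for PMID 24872877 above.
SAMPLE = """<CommentsCorrectionsList>
<CommentsCorrections RefType="AssociatedPublication">
<RefSource>Gigascience. 2014 May 28;3:8. doi: 10.1186/2047-217X-3-8. eCollection 2014.</RefSource>
<PMID Version="1">24872878</PMID>
</CommentsCorrections>
</CommentsCorrectionsList>"""


def associated_links(xml_text):
    """Return (RefType, PMID) pairs for the newly added RefType values.

    PMID is optional in the DTD, so the second item may be None.
    """
    root = ET.fromstring(xml_text)
    links = []
    for cc in root.iter("CommentsCorrections"):
        ref_type = cc.get("RefType")
        if ref_type in NEW_REF_TYPES:
            links.append((ref_type, cc.findtext("PMID")))
    return links


if __name__ == "__main__":
    for ref_type, pmid in associated_links(SAMPLE):
        print(ref_type, pmid)
```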

Comment entered 2014-11-06 09:28:20 by Kline, Bob (NIH/NCI) [C]

I have modified the Citation schema to accommodate the forthcoming changes at NLM. All of the modifications are backward compatible, so we should be able to install them now, even though NLM has not yet put their changes into production.

Comment entered 2014-11-13 12:43:01 by Osei-Poku, William (NIH/NCI) [C]

Verified on DEV.

Comment entered 2014-12-19 17:29:11 by Juthe, Robin (NIH/NCI) [E]

Looks like these NLM changes have been made (according to the emails above, they were scheduled to happen around Dec 15). Christina imported several citations today and all are invalid. Thus, the only way to run a QC report for the summaries containing these citations is to use the Quick & Dirty version. And I'm guessing we can't publish a summary with invalid citations, so we'll need to get these changes promoted as soon as possible. Is it worth exploring the possibility of promoting them outside of the release to get them up sooner?

Comment entered 2014-12-19 17:39:03 by Kline, Bob (NIH/NCI) [C]

And I'm guessing we can't publish a summary with invalid citations, so we'll need to get these changes promoted as soon as possible. Is it worth exploring the possibility of promoting them outside of the release to get them up sooner?

Something we'll want to discuss with Erika (I added her as a watcher). We're about at the finish line with the appscan fixes, so the patch release shouldn't be that far down the road. Let's talk next week.

Comment entered 2014-12-19 17:44:38 by Juthe, Robin (NIH/NCI) [E]

Sounds good - thanks! (I'll be in Mon & Tues of next week)

Comment entered 2014-12-22 10:30:32 by Beckwith, Margaret (NIH/NCI) [E]

This is a high priority issue. Depending on how soon we will do the patch, and with the time involved to do an appscan, it could be a while before the fix gets promoted. We are not able to publish any new citations at this point, which is not something that we can do without for long. We are also not able to run a regular QC report on any summary that has invalid citation links in it. This is also a problem that needs to be fixed as soon as possible. Thanks.

Comment entered 2014-12-22 10:48:18 by Kline, Bob (NIH/NCI) [C]

Erika:

No one anticipated that the appscan bump in the road would drag the next patch this far beyond the original target date. Should we open an emergency ticket with CBIIT or do we need to wait for the patch?

We may want to consider building a web interface for installing schema changes. We can already do the schema change itself without CBIIT's help (the schema is just another CDR XML document, after all), but we can't recreate the DTD for XMetaL without getting CBIIT's assistance.

Comment entered 2014-12-22 11:31:51 by henryec

Since this is just a DTD/schema change, it should not need an app scan. This sounds like a critical fix because it is preventing people from doing work, so please put in the request to CBIIT to make the change on production so that this issue can be resolved.

The "mailer patch" can go separately after all the app scan issues are resolved.

Comment entered 2014-12-22 12:25:49 by Kline, Bob (NIH/NCI) [C]
Comment entered 2014-12-22 16:18:25 by Kline, Bob (NIH/NCI) [C]

I'll be working with David Do tomorrow at 10 to get this taken care of. If it drags on longer than 15 minutes I may miss part or all of the standups.

Comment entered 2014-12-22 16:28:57 by Englisch, Volker (NIH/NCI) [C]

You might have the stand-up by yourself, as Alan, Erika, and I are all out tomorrow. :-)

Comment entered 2014-12-23 10:58:27 by Kline, Bob (NIH/NCI) [C]

The schema changes (and corresponding DTD changes) have been promoted to production. Please verify.

Comment entered 2014-12-23 11:18:35 by Juthe, Robin (NIH/NCI) [E]

Thanks, Bob!

Comment entered 2014-12-23 12:17:08 by Juthe, Robin (NIH/NCI) [E]

This has been verified in production.
