Issue Number | 3825 |
---|---|
Summary | Annual NLM DTD changes that affect citations import |
Created | 2014-11-05 17:44:04 |
Issue Type | New Feature |
Submitted By | Osei-Poku, William (NIH/NCI) [C] |
Assigned To | Kline, Bob (NIH/NCI) [C] |
Status | Closed |
Resolved | 2014-11-06 09:28:20 |
Resolution | Fixed |
Path | /home/bkline/backups/jira/ocecdr/issue.141236 |
I have posted below the email exchanges we had regarding NLM's annual DTD changes.
______________________________
Volker Englisch
NCI OCPL – Office of Communications & Public Liaison
Contractor: Sapient Government Services
NCI: 240-276-6583
------Original Message
From: Osei-Poku, William William.Osei-Poku@icfi.com
Sent: Tuesday, November 04, 2014 2:07 PM
To: Juthe, Robin (NIH/NCI) [E]; Englisch, Volker (NIH/NCI) [C]; Kline,
Robert (NCI); Beckwith, Margaret (NIH/NCI) [E]; Alan Meyer; Henry, Erika
(NIH/NCI) [C]
Subject: RE: [Utilities-announce] PubMed E-Utilities 2015 DTD
updates
I agree that it should go in the hotfix. In the past we were able to import citations, but they remained invalid and we were unable to use them until our schema was updated.
Thanks,
William
------Original Message
From: Juthe, Robin (NIH/NCI) [E] robin.juthe@nih.gov
Sent: Tuesday, November 04, 2014 1:30 PM
To: Englisch, Volker (NIH/NCI) [C]; Kline, Robert (NCI); Beckwith,
Margaret (NIH/NCI) [E]; Osei-Poku, William; Alan Meyer; Henry, Erika
(NIH/NCI) [C]
Subject: RE: [Utilities-announce] PubMed E-Utilities 2015 DTD
updates
I think this should go into the hotfix if possible. In the meantime, we run the risk of importing duplicate citations or not being able to import citations at all, right?
Thanks,
Robin
------Original Message
From: Englisch, Volker (NIH/NCI) [C]
Sent: Tuesday, November 04, 2014 1:27 PM
To: Kline, Robert (NCI); Beckwith, Margaret (NIH/NCI) [E]; Juthe, Robin
(NIH/NCI) [E]; Osei-Poku, William; Alan Meyer; Henry, Erika (NIH/NCI)
[C]
Subject: RE: [Utilities-announce] PubMed E-Utilities 2015 DTD
updates
Do we need to hold up the Mailer hot-fix to include the citation schema hot-fix? I don't think we would be able to launch another hot-fix before December.
Thanks,
Volker
___________________________________
Volker Englisch
NCI OCPL – Office of Communications & Public Liaison
Contractor: Sapient Government Services
NCI: 240-276-6583
------Original Message
From: Kline, Robert (NCI)
Sent: Friday, October 31, 2014 12:17 PM
To: Beckwith, Margaret (NIH/NCI) [E]; Juthe, Robin (NIH/NCI) [E];
Osei-Poku, William; Alan Meyer; Englisch, Volker (NIH/NCI) [C]; Henry,
Erika (NIH/NCI) [C]
Subject: Fwd: [Utilities-announce] PubMed E-Utilities 2015 DTD
updates
Just a heads-up that we may need to update our Citation schema in the CDR some time before December. Assuming no CDR release or patch is scheduled between now and then, this will require a ticket to CBIIT to get some of the mods installed on production.
Bob
Forwarded message ----------
From: <utilities-announce@ncbi.nlm.nih.gov>
Date: Fri, Oct 31, 2014 at 11:50 AM
Subject: [Utilities-announce] PubMed E-Utilities 2015 DTD updates
To: NLM/NCBI List utilities-announce
<utilities-announce@ncbi.nlm.nih.gov>
Dear NCBI PubMed E-Utilities Users,
We anticipate updating the PubMed E-Utilities DTDs for 2015 in mid-December, approximately on December 15, 2014.
The forthcoming DTDs are now available:
http://eutils.ncbi.nlm.nih.gov/entrez/query/DTD/bookdoc_150101.dtd
http://eutils.ncbi.nlm.nih.gov/entrez/query/DTD/nlmmedlinecitationset_150101.dtd
http://eutils.ncbi.nlm.nih.gov/entrez/query/DTD/pubmed_150101.dtd
The DTD changes for the 2015 production year are itemized in the Revision Notes section near the top of the DTDs. The following describes the substantive changes:
Add new UI attribute to DescriptorName, QualifierName, NameOfSubstance, SupplMeshName and PublicationType elements. The new required UI attribute will carry the MeSH unique identifier for DescriptorName, QualifierName, NameOfSubstance, SupplMeshName and PublicationType elements.
DTD:
<!ELEMENT DescriptorName (#PCDATA)>
<!ATTLIST DescriptorName
MajorTopicYN (Y | N) "N"
Type (Geographic) #IMPLIED
UI CDATA #REQUIRED>
<!ELEMENT QualifierName (#PCDATA)>
<!ATTLIST QualifierName
MajorTopicYN (Y | N) "N"
UI CDATA #REQUIRED>
<!ELEMENT NameOfSubstance (#PCDATA)>
<!ATTLIST NameOfSubstance
UI CDATA #REQUIRED>
<!ELEMENT SupplMeshName (#PCDATA)>
<!ATTLIST SupplMeshName
Type (Disease | Protocol) #REQUIRED
UI CDATA #REQUIRED>
<!ELEMENT PublicationTypeList (PublicationType+)>
<!ELEMENT PublicationType (#PCDATA)>
<!ATTLIST PublicationType
UI CDATA #REQUIRED>
Sample XML:
<DescriptorName MajorTopicYN="N" UI="D054971">Orthostatic Intolerance</DescriptorName>
<QualifierName MajorTopicYN="N" UI="Q000628">therapy</QualifierName>
<NameOfSubstance UI="C058787">royal jelly</NameOfSubstance>
<SupplMeshName Type="Disease" UI="C537735">Oculofaciocardiodental syndrome</SupplMeshName>
<PublicationType UI="D016428">Journal Article</PublicationType>
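A minimal sketch of how an import script might read the new UI attribute, assuming Python's standard `xml.etree.ElementTree`. It is not taken from the CDR code or the NLM announcement; the `Sample` wrapper element, the `collect_mesh_uis` helper, and the `PublicationType` without a UI (standing in for a pre-2015 record) are hypothetical.

```python
import xml.etree.ElementTree as ET

# Elements that gain the required UI attribute in the 2015 DTDs.
UI_ELEMENTS = ("DescriptorName", "QualifierName", "NameOfSubstance",
               "SupplMeshName", "PublicationType")

def collect_mesh_uis(root):
    """Return (tag, UI, text) tuples; UI is None when the attribute is absent."""
    found = []
    for elem in root.iter():
        if elem.tag in UI_ELEMENTS:
            text = " ".join((elem.text or "").split())
            found.append((elem.tag, elem.get("UI"), text))
    return found

# Hypothetical wrapper; the last element mimics a record without the new attribute.
SAMPLE = """<Sample>
<DescriptorName MajorTopicYN="N" UI="D054971">Orthostatic Intolerance</DescriptorName>
<QualifierName MajorTopicYN="N" UI="Q000628">therapy</QualifierName>
<NameOfSubstance UI="C058787">royal jelly</NameOfSubstance>
<PublicationType>Journal Article</PublicationType>
</Sample>"""

mesh_uis = collect_mesh_uis(ET.fromstring(SAMPLE))
for tag, ui, text in mesh_uis:
    print(tag, ui or "(no UI)", text)
```

Treating a missing UI as `None` rather than an error keeps the reader tolerant of records fetched before the DTD switch.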
Add new optional and repeatable envelope element AffiliationInfo to Author and Investigator elements. The AffiliationInfo envelope element includes Affiliation and Identifier elements.
DTD:
<!ELEMENT Author (((LastName, ForeName?, Initials?, Suffix?) |
CollectiveName), Identifier*, AffiliationInfo*)>
<!ATTLIST Author ValidYN (Y | N) "Y">
<!ELEMENT Investigator (LastName, ForeName?, Initials?, Suffix?,
Identifier*, AffiliationInfo*)>
<!ATTLIST Investigator ValidYN (Y | N) "Y">
<!ELEMENT AffiliationInfo (Affiliation, Identifier*)>
<!ELEMENT Affiliation (#PCDATA)>
<!ELEMENT Identifier (#PCDATA)>
<!ATTLIST Identifier
Source CDATA #REQUIRED>
Sample XML:
<AuthorList CompleteYN="Y">
<Author ValidYN="Y">
<LastName>Rome</LastName>
<ForeName>Benjamin N</ForeName>
<Initials>BN</Initials>
<Identifier Source="ORCID">0000000111111111</Identifier>
<AffiliationInfo>
<Affiliation>Harvard Medical School, Boston, Massachusetts</Affiliation>
<Identifier Source="Ringgold">123456</Identifier>
</AffiliationInfo>
<AffiliationInfo>
<Affiliation>Program on Regulation, Therapeutics, and Law, Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Brigham and Women's Hospital, Boston, Massachusetts</Affiliation>
</AffiliationInfo>
<AffiliationInfo>
<Affiliation>Beth Israel Deaconess Medical</Affiliation>
<Identifier Source="Ringgold">678922</Identifier>
</AffiliationInfo>
</Author>
</AuthorList>
<InvestigatorList>
<Investigator ValidYN="Y">
<LastName>Salloway</LastName>
<ForeName>S</ForeName>
<Initials>S</Initials>
<AffiliationInfo>
<Affiliation>University of Missouri at Kansas City</Affiliation>
<Identifier Source="Ringgold">11223344</Identifier>
</AffiliationInfo>
<AffiliationInfo>
<Affiliation>Kansas City AIDS Research Consortium</Affiliation>
</AffiliationInfo>
<AffiliationInfo>
<Affiliation>AIDS Administration Missouri</Affiliation>
<Identifier Source="Ringgold">66778899</Identifier>
</AffiliationInfo>
</Investigator>
</InvestigatorList>
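As an illustration of the new structure, here is a rough sketch of pulling each affiliation and its identifiers out of an Author or Investigator, again assuming Python's standard `xml.etree.ElementTree`. The `affiliations` helper is hypothetical and is not part of the CDR import code; the XML is the Investigator sample from the announcement.

```python
import xml.etree.ElementTree as ET

def affiliations(person):
    """Return [(affiliation text, {Source: identifier})] for one Author/Investigator."""
    result = []
    # Only direct AffiliationInfo children are examined, so a person-level
    # Identifier (such as an ORCID) is not mistaken for an affiliation identifier.
    for info in person.findall("AffiliationInfo"):
        name = (info.findtext("Affiliation") or "").strip()
        ids = {i.get("Source"): (i.text or "").strip()
               for i in info.findall("Identifier")}
        result.append((name, ids))
    return result

SAMPLE = """<InvestigatorList>
<Investigator ValidYN="Y">
<LastName>Salloway</LastName>
<ForeName>S</ForeName>
<Initials>S</Initials>
<AffiliationInfo>
<Affiliation>University of Missouri at Kansas City</Affiliation>
<Identifier Source="Ringgold">11223344</Identifier>
</AffiliationInfo>
<AffiliationInfo>
<Affiliation>Kansas City AIDS Research Consortium</Affiliation>
</AffiliationInfo>
<AffiliationInfo>
<Affiliation>AIDS Administration Missouri</Affiliation>
<Identifier Source="Ringgold">66778899</Identifier>
</AffiliationInfo>
</Investigator>
</InvestigatorList>"""

investigator = ET.fromstring(SAMPLE).find("Investigator")
investigator_affiliations = affiliations(investigator)
for name, ids in investigator_affiliations:
    print(name, ids)
```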
The valid value UNLABELLED is removed from the AbstractText NlmCategory attribute because it is not used.
DTD:
<!ELEMENT AbstractText (#PCDATA)>
<!ATTLIST AbstractText
Label CDATA #IMPLIED
NlmCategory (BACKGROUND | OBJECTIVE | METHODS | RESULTS | CONCLUSIONS |
UNASSIGNED) #IMPLIED>
The valid values AssociatedDataset and AssociatedPublication are added to the CommentsCorrections RefType attribute.
DTD:
<!ELEMENT CommentsCorrectionsList (CommentsCorrections+)>
<!ELEMENT CommentsCorrections (RefSource, PMID?, Note?)>
<!ATTLIST CommentsCorrections
RefType (AssociatedDataset | AssociatedPublication | CommentOn | CommentIn | ErratumIn | ErratumFor |
PartialRetractionIn | PartialRetractionOf | RepublishedFrom | RepublishedIn | RetractionOf |
RetractionIn | UpdateIn | UpdateOf | SummaryForPatientsIn | OriginalReportIn | ReprintOf | ReprintIn | Cites) #REQUIRED>
Sample XML:
PMID 24872877
<CommentsCorrectionsList>
<CommentsCorrections RefType="AssociatedPublication">
<RefSource>Gigascience. 2014 May 28;3:8. doi: 10.1186/2047-217X-3-8. eCollection 2014.</RefSource>
<PMID Version="1">24872878</PMID>
</CommentsCorrections>
</CommentsCorrectionsList>
PMID 24872878
<CommentsCorrectionsList>
<CommentsCorrections RefType="AssociatedDataset">
<RefSource>Gigascience. 2014 May 28;3:7. doi: 10.1186/2047-217X-3-7. eCollection 2014.</RefSource>
<PMID Version="1">24872877</PMID>
</CommentsCorrections>
</CommentsCorrectionsList>
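A small sketch of how the two new RefType values could be picked out of a record, assuming Python's standard `xml.etree.ElementTree`. The `associated_pmids` helper and the second `Cites` entry are hypothetical, added only to show that existing RefType values are left alone; the first entry is the sample for PMID 24872877 from the announcement.

```python
import xml.etree.ElementTree as ET

# The two RefType values added in the 2015 DTD.
NEW_REF_TYPES = {"AssociatedDataset", "AssociatedPublication"}

def associated_pmids(root):
    """Map each new RefType to the list of PMIDs it points at."""
    links = {}
    for cc in root.iter("CommentsCorrections"):
        ref_type = cc.get("RefType")
        pmid = cc.findtext("PMID")  # PMID is optional in the DTD
        if ref_type in NEW_REF_TYPES and pmid:
            links.setdefault(ref_type, []).append(pmid.strip())
    return links

SAMPLE = """<CommentsCorrectionsList>
<CommentsCorrections RefType="AssociatedPublication">
<RefSource>Gigascience. 2014 May 28;3:8. doi: 10.1186/2047-217X-3-8. eCollection 2014.</RefSource>
<PMID Version="1">24872878</PMID>
</CommentsCorrections>
<CommentsCorrections RefType="Cites">
<RefSource>Hypothetical older reference without a PMID</RefSource>
</CommentsCorrections>
</CommentsCorrectionsList>"""

links = associated_pmids(ET.fromstring(SAMPLE))
print(links)
```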
I have modified the Citation schema to accommodate the forthcoming changes at NLM. All of the modifications are backward compatible, so we should be able to install them now, even though NLM has not yet put their changes into production.
Verified on DEV.
Looks like these NLM changes have been made (according to the emails above, they were scheduled to happen around Dec 15). Christina imported several citations today and all are invalid. Thus, the only way to run a QC report for the summaries containing these citations is to use the Quick & Dirty version. And I'm guessing we can't publish a summary with invalid citations, so we'll need to get these changes promoted as soon as possible. Is it worth exploring the possibility of promoting them outside of the release to get them up sooner?
Something we'll want to discuss with Erika (I added her as a watcher). We're about at the finish line with the appscan fixes, so the patch release shouldn't be that far down the road. Let's talk next week.
Sounds good - thanks! (I'll be in Mon & Tues of next week)
This is a high priority issue. Depending on how soon we will do the patch, and with the time involved to do an appscan, it could be a while before the fix gets promoted. We are not able to publish any new citations at this point, which is not something that we can do without for long. We are also not able to run a regular QC report on any summary that has invalid citation links in it. This is also a problem that needs to be fixed as soon as possible. Thanks.
Erika:
No one anticipated that the appscan bump in the road would drag the next patch this far beyond the original target date. Should we open an emergency ticket with CBIIT or do we need to wait for the patch?
We may want to consider building a web interface for installing schema changes. We can already do the schema change itself without CBIIT's help (the schema is just another CDR XML document, after all), but we can't recreate the DTD for XMetaL without getting CBIIT's assistance.
Since this is just a DTD/schema change, it should not need an app scan. This sounds like a critical fix because it is preventing people from doing work, so please put in the request to CBIIT to make the change on production so that this issue can be resolved.
The "mailer patch" can go separately after all the app scan issues are resolved.
Webteam ticket entered:
I'll be working with David Do tomorrow at 10 to get this taken care of. If it drags on longer than 15 minutes I may miss part or all of the standups.
You might have the stand-up by yourself, as Alan, Erika, and I are all out tomorrow. :-)
The schema changes (and corresponding DTD changes) have been promoted to production. Please verify.
Thanks, Bob!
This has been verified in production.