Issue Number | 906 |
---|---|
Summary | Documentation for filters |
Created | 2003-09-23 07:32:52 |
Issue Type | Bug |
Submitted By | Kline, Bob (NIH/NCI) [C] |
Assigned To | Gottlieb, Nanci (NIH/NCI) [E] [X] |
Status | Closed |
Resolved | 2012-01-03 11:12:59 |
Resolution | Fixed |
Path | /home/bkline/backups/jira/ocecdr/issue.105234 |
BZISSUE::905
BZDATETIME::2003-09-23 07:32:52
BZCREATOR::Bob Kline
BZASSIGNEE::Nanci Gottlieb
BZQACONTACT::Bob Kline
As discussed in a recent project status meeting, the filters in the
CDR system need to be documented. This documentation should be
created as part of the online documentation subsystem, and should
include the purpose of each filter, where it is used, and (for those
which produce XML rather than HTML, and for which this is not already
documented elsewhere) the structure of the output documents.
BZDATETIME::2003-10-09 11:01:21
BZCOMMENTOR::Nanci Gottlieb
BZCOMMENT::1
Please always remember this is a "Mammoth" project! I have done 15
DocTitle
filters. If you would like to take a look at them just go into XMetal on
Bach
and search for Documentation documents that start with "Filter:". All
comments
and suggestions are welcome.
BZDATETIME::2003-10-23 10:32:43
BZCOMMENTOR::Nanci Gottlieb
BZCOMMENT::2
Completed CDR140-Passthrough Filter and CDR49-Protocol DocTitle Filter.
Finishing up on CDR135-Vendor Filter: Country. This one has a lot of
code for stuff that is optional in the DTD and is never used, so it is
difficult to figure out what it is supposed to do (mostly the date part).
BZDATETIME::2003-10-29 15:30:00
BZCOMMENTOR::Nanci Gottlieb
BZCOMMENT::3
Started documenting filter 134 -Vendor Filter: Term but it requires
filter 101
which is the Term denormalization filter. I had questions for Volker
which
made him realize that this was done the old way and needs to be
rewritten the
new way. So now I am working on filter 136 -Vendor Filter: Political
SubUnit
which also requires a denormalization filter (103). I'm not sure how
to
document the output of the denormalization filters and Volker suggested
we
might want to talk to Bob and Alan before I continue.
BZDATETIME::2003-11-05 14:11:27
BZCOMMENTOR::Nanci Gottlieb
BZCOMMENT::4
Documented filter CDR103 - Denormalization Filter (1/1): Political
SubUnit. I used the schema to show the output. Could someone please go
to XMetal on Bach and take a look at it (CDR344292) and let me know if
it is okay.
I then went to do Filter CDR136 - Vendor Filter: PoliticalSubUnit
(which needs CDR103 to be run first). I could only find one example with a
<PoliticalSubUnitAlternateName>. It is CDR256193. Unfortunately the
alternate name seems to disappear when the denormalization filter is run
so it never makes it to the CDR136 output. Either it needs to be added to
CDR103 (denorm) or removed from CDR136 (Vendor Filter: PoliticalSubUnit).
I don't know which one needs to be fixed.
BZDATETIME::2003-11-17 13:52:52
BZCOMMENTOR::Nanci Gottlieb
BZCOMMENT::5
I am currently working on CDR137 – Vendor Filter: GlossaryTerm. It
seems to
set a variable to equal "document('cdr:/*/CdrCtl')". I am told this
refers to
a control table that stores info about all the documents. Does anyone
know
where I can learn more about this control table as it seems to be
referred to
in several filters?
Thanks
BZDATETIME::2003-11-18 14:15:02
BZCOMMENTOR::Volker Englisch
BZCOMMENT::6
Regarding Comment #4 (CDR103):
I am wondering what the definition for "Used by" and "Filters that
require it"
would be. On a systems level these two items are probably the same but
I'm not
sure if you wanted to indicate what person or functional group would be
using
this filter.
The "Filters that require it" section is incomplete. The
PoliticalSubUnit
Denormalization filter (CDR103) is used by two filter sets:
QC PoliticalSubUnit Set
CDR0000000103:Denormalization Filter (1/1): Political SubUnit
CDR0000000118:Political SubUnit QC Report
Vendor PoliticalSubUnit Set
CDR0000000103:Denormalization Filter (1/1): Political SubUnit
CDR0000000136:Vendor Filter: PoliticalSubUnit
CDR0000315573:Vendor Filter: Final
The section of "What it outputs" is not correct. You have listed
the
PoliticalSubUnit schema that is used for the validation of the
input of the
denormalization filter.
For instance, the output of the denormalization filter contains the
CdrDocCtl as well as the Continent elements. However, these elements
are not specified in the schema listed.
I believe by using Bob's approach of describing the filter output we
would have
a compact notation and a good compromise between writing a DTD or schema
for the
intermediate steps (the ultimate verification) and having to review the
entire
source code of the filter creating the output.
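To make the "Filters that require it" question concrete, here is a minimal Python sketch that derives that information from the two filter sets listed above instead of maintaining it by hand. The dictionary layout and the function names are assumptions made for illustration; they are not part of the CDR code base.

```python
# Filter sets copied from the listing above; list order is execution order.
FILTER_SETS = {
    "QC PoliticalSubUnit Set": ["CDR0000000103", "CDR0000000118"],
    "Vendor PoliticalSubUnit Set": [
        "CDR0000000103",
        "CDR0000000136",
        "CDR0000315573",
    ],
}


def sets_using(filter_id, filter_sets=FILTER_SETS):
    """Return the names of the filter sets that contain filter_id."""
    return sorted(
        name for name, members in filter_sets.items() if filter_id in members
    )


def filters_after(filter_id, filter_sets=FILTER_SETS):
    """Return the IDs of filters that run after filter_id in any set."""
    later = set()
    for members in filter_sets.values():
        if filter_id in members:
            later.update(members[members.index(filter_id) + 1:])
    return sorted(later)


print(sets_using("CDR0000000103"))
print(filters_after("CDR0000000103"))
```

If the filter sets are the source of truth, a lookup like this would keep the "Filters that require it" section from going stale as new sets are added.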
BZDATETIME::2003-11-18 15:06:14
BZCOMMENTOR::Nanci Gottlieb
BZCOMMENT::7
Regarding Comment #6 (CDR103)
First of all, please realize that I started in on this CDR project
long after
it began and NOBODY ever explained anything about it to me. I just
started
going to the meetings so I am kind of winging it.
Please refer to comment #1 when I asked others to look at the first
15 filters
I had completed. I said all comments and suggestions are welcome (back
on
October 9). Nobody gave me any comments or suggestions until November 6
when
Bob handed out printouts of a better way and I have been doing it his
way on
the filters I have completed since then. I haven’t gone back to do the
ones
already completed before then (but I will).
In your first paragraph you ask about “used by” and “filters that
require it”.
When I say “Used by:” I mean who or what uses this filter. In the case
of the
DocTitles I documented first, the filters were used by XMetal to create
the
Document titles when you save the new document. In the ones I’ve done
lately
you have told me they are used during the publishing job. When I say
“Filters that require it:” I mean that certain filters require other
filters to be run first. If we don’t need the “used by” I can delete it.
The “Filters that require it" section is incomplete because I haven’t
gotten
to all the other filters yet. I was just going to go back and add them
as I
completed them.
In the “What it outputs” section I need to redo it (on CDR103) and I
will try
to understand what you wrote before I do it. But as you know, I have
no
problems asking you (Volker) to explain things to me – you do a great
job.
Today I completed filters CDR137 and CDR315573 (they are documented
in
CDR346470 and CDR346472). I’ve used Bob’s approach. Please take a look
and see
if this is better.
If I use the wrong terminology please let me know because I can
change
anything. And if you tell me why it should be changed that helps me to
learn
and that is a good thing.
I am still waiting for someone to tell me where I can learn more
about the
control table I asked about in comment #5.
I hope this all makes sense. This is my longest comment yet.
BZDATETIME::2003-11-18 15:47:28
BZCOMMENTOR::Alan Meyer
BZCOMMENT::8
I have done some research in the CdrServer code to answer the
question in
comment 5. As I understand it, "document('cdr:/*/CdrCtl')" is an XSLT
extension
function call that maps to a function inside the server that produces an
XML
document (I presume it appears as a node set to the filter that invokes
the
function) that contains control information for the document.
"document()" is standard XSLT for retrieving another document. But
how it
actually works depends on what logic a programmer (Mike Rubenstein in
our case)
supplies for finding documents. For some systems it is a directory
lookup in a
file system. For us it's a database lookup. I haven't tried to track
down all
the logic, but I presume that the "*" just indicates the current
document, so
"document('cdr:/*/CdrCtl')" means the control information for the
current document.
The control information may include the following hierarchical fields:
DocCtl
  DocValDate
  DocTitle
  DocComment
  ReadyForReview
  Create
    Date
    User
  Modify
    Date
    User
  FirstPub
    Date
What it actually produces will depend in part on what it finds in the
record,
and in part on the parameters used by the filter module. I haven't tried
to
track those down since it may be more information than we need for this
purpose.
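As a rough illustration of what a filter would see, the following Python sketch parses a control block of the shape listed above and flattens it into path/value pairs. The sample document, its values, and the function name are invented for this example; the real server output may contain different or additional fields.

```python
import xml.etree.ElementTree as ET

# Invented sample; only the element names come from the field list above.
SAMPLE = """
<DocCtl>
  <DocValDate>2003-11-01</DocValDate>
  <DocTitle>Example title</DocTitle>
  <Create><Date>2003-01-15</Date><User>someuser</User></Create>
  <Modify><Date>2003-11-01</Date><User>otheruser</User></Modify>
</DocCtl>
"""


def flatten(elem, prefix=""):
    """Map each leaf element to its slash-separated path and text value."""
    path = f"{prefix}/{elem.tag}" if prefix else elem.tag
    children = list(elem)
    if not children:
        return {path: (elem.text or "").strip()}
    flat = {}
    for child in children:
        flat.update(flatten(child, path))
    return flat


control = flatten(ET.fromstring(SAMPLE))
for path, value in control.items():
    print(f"{path} = {value}")
```

Fields that are absent from the record (FirstPub in this sample) simply do not appear in the result, which matches the observation that the output depends on what is found in the record.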
BZDATETIME::2003-12-03 15:29:33
BZCOMMENTOR::Nanci Gottlieb
BZCOMMENT::9
I've kind of taken a break on this until I hear back about the format
I have
been using. I know Volker had raised some questions and at the last
meeting
Bob said he would take a look at them. I'd like to make sure that I am
doing
it the way you all want it done before I continue on.
Thanks,
Nanci
BZDATETIME::2003-12-03 16:20:18
BZCOMMENTOR::Bob Kline
BZCOMMENT::10
I'll take a look at this in the morning before the status meeting.
Alan and Volker: it would probably be a good idea for both of
you
to provide input on the format here, since you'll both be
consumers
of the documentation.
BZDATETIME::2003-12-04 10:27:10
BZCOMMENTOR::Bob Kline
BZCOMMENT::11
The format of the output structure looks fine.
I have modified the documentation for the vendor final filter.
The structure has been altered somewhat, and I fleshed out the
description of what the filter does. Depending on how many
different filters pull in an included filter, it may be
appropriate
in some cases to describe what the included filter does separately
(referring to the separate documentation in the docs for the
including
filter), or embed that description of the logic in the docs for
the including filter itself (this will be most appropriate when
the portions of the included filter which get applied depend on
the
interaction with the including filter). For purposes of this
example,
I have documented everything the filter does (including the
imported
filters) in one place.
Visit
http://bach.nci.nih.gov/cgi-bin/cdr/Filter.py?DocId=CDR0000346472&Filter=name:Documentation+Help+Screens+Filter
to see the results of my modifications.
BZDATETIME::2003-12-04 10:31:33
BZCOMMENTOR::Bob Kline
BZCOMMENT::12
By the way, be careful not to mark as [string] elements which are
containers for other elements, and which do not themselves have any
direct text nodes. (The two elements for the term definition
should have "[string]" removed.)
BZDATETIME::2003-12-04 10:37:10
BZCOMMENTOR::Bob Kline
BZCOMMENT::13
One more comment:
Because the vendor final filter is used by all of the document
types, it is less appropriate to include the output structure
block (it will be different for each of the document types).
In this case, the DTD itself would be the place to go for the
output structure. You would want to have the structure for the
vendor output for specific document types included in the docs
for the filters which precede the general-purpose vendor final
filter. Let's replace the GlossaryTerm elements in the Output
Structure section with a line that says "This is the final filter
for vendor output for all document types; for the output structure
details, please refer to the vendor DTD (pdq.dtd)."
BZDATETIME::2003-12-17 15:23:57
BZCOMMENTOR::Nanci Gottlieb
BZCOMMENT::14
I've gone back and redone all 15 of the DocTitle filter documentation
documents in the new format. I've also updated the Passthrough Filter,
Vendor Filter: Country, and the Denorm Filter (1/1): Terminology. I
still have the Denorm Filter (1/1): Political SubUnit, Vendor Filter:
PoliticalSubUnit, Vendor Filter: Glossary Term, and Vendor Filter: Final
to redo before I move on to new ones.
I am also learning all about DTDs and Schemas.
BZDATETIME::2004-01-08 10:36:43
BZCOMMENTOR::Nanci Gottlieb
BZCOMMENT::15
I have finished updating all the filters that I have done previously
to the new format. I am beginning to work on the "include" filters next.
BZDATETIME::2004-01-22 09:00:57
BZCOMMENTOR::Nanci Gottlieb
BZCOMMENT::16
Update: I'm almost finished with the "include" filters. I'm actually
just
putting in the basic information and then will go back and fill out the
Output
Structure part later. Cheryl is helping out by doing the
InScopeProtocol
filters.
BZDATETIME::2004-01-29 08:33:34
BZCOMMENTOR::Nanci Gottlieb
BZCOMMENT::17
Finished the include filters except for the Output Structure part.
I'm not
quite sure how that part should be done. What has been completed so far
can be
viewed at http://bach.nci.nih.gov/cgi-bin/cdr/Help.py?flavor=System.
They're down at the bottom of the page. The ones that don't have
asterisks before them are done/started.
BZDATETIME::2004-01-29 08:44:27
BZCOMMENTOR::Bob Kline
BZCOMMENT::18
Volker:
Could you review what Nanci has so far and provide her with
some
guidance for the output structure portion of the docs for include
filters? Thanks.
BZDATETIME::2004-01-29 09:59:08
BZCOMMENTOR::Cheryl Burg
BZCOMMENT::19
The InScopeProtocol Include filters used in the QC Reports, and the
QC Report
filters have been completed. The Output structure sections for the QC
filters
will be completed when a method for linking to sample QC report output
is
determined. I noted that the HP Summary Reports were not listed in the
Index
under QC Filters. I will discuss with Volker before beginning
documentation.
BZDATETIME::2004-01-29 15:37:18
BZCOMMENTOR::Volker Englisch
BZCOMMENT::20
I've added an entry for the Redline/Strikeout and Bold/Underline QC
reports to
the Systems TOC.
BZDATETIME::2004-02-04 14:51:43
BZCOMMENTOR::Cheryl Burg
BZCOMMENT::21
Completed documentation of Global Change filters now in production.
BZDATETIME::2004-02-12 08:05:38
BZCOMMENTOR::Nanci Gottlieb
BZCOMMENT::22
I've been working on the denormalization filter documentation. I use
the
schemas to help with the Output Structure part. Starting with the
Citation
schema and now the Summary schema there have been a lot of new schema
features
(like groups and choices) that I am not quite familiar with. I have
ordered a
book all about XML Schemas. It should be here early next week.
BZDATETIME::2004-02-24 11:22:39
BZCOMMENTOR::Nanci Gottlieb
BZCOMMENT::23
I'm working on the Output Structure part of the Citation
Denormalization. There are several "Choice"s in the schema for this and
I am wondering what the best way to display this is.
An example is:
<group name="PubDateGroup">
  <choice>
    <sequence>
      <element name="Year" type="integer" />
      <choice minOccurs="0">
        <sequence>
          <element name="Month" type="NotEmptyString" />
          <element name="Day" type="NotEmptyString" minOccurs="0" />
        </sequence>
        <element name="Season" type="NotEmptyString" />
      </choice>
    </sequence>
    <element name="MedlineDate" type="NotEmptyString" />
  </choice>
</group>
I believe the possible results are:
Year
or
Year Month
or
Year Month Day
or
Year Season
or
Medline Date
The way I would normally display this is:
PubDateGroup
  choice
    Year [integer][1]
    choice [0:1]
      Month [NotEmptyString][1]
      Day [NotEmptyString][0:1]
    or
      Season [NotEmptyString][1]
  or
    MedlineDate [NotEmptyString][1]
But it is hard to tell where one choice begins and another ends and
since
there is a choice between a sequence and a single element it becomes
even more
confusing.
Another way to display it could be:
PubDateGroup
  choice 1
    sequence 1
      Year [integer][1]
      choice 2 [0:1]
        sequence 2
          Month [NotEmptyString][1]
          Day [NotEmptyString][0:1]
        /sequence 2
        or
        Season [NotEmptyString][1]
      /choice 2
    /sequence 1
    or
    MedlineDate [NotEmptyString][1]
  /choice 1
But then this is just one small group in an output full of choices
that
contain types that have even more choices so it is becoming very
difficult to
see where one choice ends and another one begins.
Does anyone have any suggestions on displaying this in a way that is
easy to
understand?
Thanks,
Nanci
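One way to check a list of "possible results" like the one above is to expand the group mechanically. The Python sketch below does that for the PubDateGroup example. It is only an illustration: it assumes the un-prefixed element names used in the excerpt (a real schema would carry a namespace), it ignores maxOccurs, and the function name is made up.

```python
import xml.etree.ElementTree as ET

# The PubDateGroup excerpt from the comment above, without a namespace.
GROUP = """
<group name="PubDateGroup">
  <choice>
    <sequence>
      <element name="Year" type="integer"/>
      <choice minOccurs="0">
        <sequence>
          <element name="Month" type="NotEmptyString"/>
          <element name="Day" type="NotEmptyString" minOccurs="0"/>
        </sequence>
        <element name="Season" type="NotEmptyString"/>
      </choice>
    </sequence>
    <element name="MedlineDate" type="NotEmptyString"/>
  </choice>
</group>
"""


def expansions(node):
    """Return every element-name sequence the node allows."""
    if node.tag == "element":
        results = [[node.get("name")]]
    elif node.tag == "choice":
        # A choice allows whatever any one of its children allows.
        results = [seq for child in node for seq in expansions(child)]
    else:
        # "sequence" and "group": combine the children's alternatives in order.
        results = [[]]
        for child in node:
            results = [
                left + right
                for left in results
                for right in expansions(child)
            ]
    if node.get("minOccurs") == "0":
        # An optional node may also contribute nothing.
        results = [[]] + results
    return results


for names in expansions(ET.fromstring(GROUP)):
    print(" ".join(names))
```

For this group the expansion yields five combinations, the same five listed in the comment, which suggests the hand-derived list is complete.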
BZDATETIME::2004-03-10 20:06:09
BZCOMMENTOR::Volker Englisch
BZCOMMENT::24
I spent a little time to go through some of the filters. Right now I
just
tackled the DocTitle filters.
General Comments:
=================
I would like to see all of the filters use the same filter name
pattern, such as
"DocTitle for [DocumentType]"
Most of the filters do use this title, but some do not, for example
"DocTitle for Glossary Term" or "Inscope Protocol DocTitle Filter".
We will need to discuss who should be doing this change (if we decide to
go
forward with it) and which parts of the system would be
affected.
Some of the filters have not listed a single line of comments and
are not
formatted as we all would like it.
This should be fixed since it makes it easier to read and maintain
the
filters.
At the same time we could eliminate code that's commented out in
several
filters and remove the bogus <text/> tag that appears here and
there.
Well, these tags are not completely bogus but it would be better to write
...<text>;</text>
instead of
...;<text/>
to avoid an extra space being displayed after the semicolon.
Again, we would need to discuss who should be doing this
change.
I would like to add the document ID of the filter to the Filter
Title.
This has the advantage - if you don't know the filter name but you have
the
title - that you can search for the ID in XMetaL.
Use the notation CDR67 instead of CDR0000000067 in the title.
Add the document ID of the filter to the filter names of the TOC
for the
same reason as above.
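For reference, converting between the two ID notations mentioned above is mechanical. The Python sketch below shows one way to do it; the function names are hypothetical, and the ten-digit padding is taken from the CDR0000000067 example.

```python
import re


def short_cdr_id(doc_id):
    """Convert 'CDR0000000067' (or 'CDR67') to the short form 'CDR67'."""
    match = re.fullmatch(r"CDR0*(\d+)", doc_id.strip())
    if match is None:
        raise ValueError(f"not a CDR document ID: {doc_id!r}")
    return f"CDR{match.group(1)}"


def long_cdr_id(doc_id):
    """Convert 'CDR67' to the zero-padded form 'CDR0000000067'."""
    number = int(short_cdr_id(doc_id)[3:])
    return f"CDR{number:010d}"


print(short_cdr_id("CDR0000000067"))
print(long_cdr_id("CDR67"))
```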
When creating the table the Title element is mandatory. By
leaving this
element empty the display shows two extra blank lines between the
section
title "Filter Information" and the table.
I would recommend either dropping the SectionTitle and entering "Filter
Information" as the Table/Title text, or leaving the SectionTitle and
entering a Table/Title such as "Overview", or listing the SectionTitle as
"Overview" and the Table/Title as "Filter Information".
Where it exists, I would remove the row "Other Authors" since it is
not very meaningful information; you could extract the information from
the CVS log instead.
For the initial Author I would prefer to list the full name instead of
first names only and list TBD instead of "?"
I believe it's safe to say that the initial author for almost all of
these
filters was Cheryl.
Some filters list a name next to the version of the filter. I
would probably remove this or be consistent and list it everywhere.
In regard to the last comment you might want to replace the "Other
Authors"
with a row "Last Author" that would name the developer creating the
latest
version in CVS.
The country filter has the "Output Structure" as a table row. All
others list
the output structure as a section.
I'm not sure yet what the benefit of the "Description" row is but
that's
probably because the filter title and the description are almost
identical
for the DocTitle filters. I might want to come back to this later once
I
looked at some other filter types.
For now I would just like the description to be listed consistently
as:
Creates DocTitle for [Doctype] documents
or something like that and please use the document type to replace
[Doctype]
instead of repeating the typos existing in the filter title.
When you display an example I would display it as:
Example
<TT>This is the example</TT>
instead of
<TT>Ex: "This is the example"</TT>
Specific Comments:
==================
Country)
Initial Author is CB
Move Output Structure out of the table
I'm thinking if you want to write for these trivial filters
something like
DocTitle = /Country/CountryFullName [string]
GlossaryTerm)
Filter title should be "DocTitle for GlossaryTerm"
IA is CB
DocTitle = /GlossaryTerm/TermName [string]
PoliticalSubUnit)
IA is CB
DocTitle = /PoliticalSubUnit/PoliticalSubUnitFullName [string]
Notes: Yes, the comment is wrong.
PublishingSystem)
DocTitle = /PublishingSystem/SystemName [string]
there are a lot of blank lines in the filter that could be removed.
Summary)
to 1.): The schema allows one and only one SummaryTitle. Just
because
there is a for-each loop around the element doesn't mean this element
is
multiply occurring.
Note: The text node of the title is displayed here. A title like
this:
E = m c<Superscript>2</Superscript>
would be displayed as
E = m c2
to 2.): Same as above. SummaryAudience exists exactly once per
schema.
I'm not sure if it would make more sense to display the output structure
as
Format:
SummaryTitle;SummaryType;SummaryAudience
Example:
Adrenocortical Carcinoma;Treatment;Patient
instead of
Ex: Adrenocortical Carcinoma;Treatment;Patient
1 ; 2 ; 3
1. SummaryTitle
2. SummaryType
3. SummaryAudience
For me personally it's easier to follow an example once I know the
rule instead of the other way around.
ScientificProtocolInfo)
DocTitle =
PrimaryID/IDString;OtherID/IDString[1];ProtocolTitle[@Type='Original']
output limited to 255 chars
OutOfScopeProtocol)
DocTitle =
PrimaryID/IDString;[OtherID/IDString[1]];ProtocolTitle/TitleText
output limited to 255 chars
Note: only the first OtherID element is displayed. If none exists the
output displays two semicolons next to each other.
The text node of the TitleText element is displayed.
Answer to your question:
Yes, the code commented out could be deleted.
Miscellaneous)
MiscellaneousDocumentTitle;MiscellaneousDocumentType[;Language]
Language is displayed for Spanish documents only
InScopeProtocol)
PrimaryID/IDString;[OtherID/IDString;[OtherID/IDString;...]]\
ProtocolTitle[@Type='Professional']
Output limited to 255 chars
The OutOfScope and InScope protocol DocTitle filters should probably
be
combined or use a similar format.
Citation)
Remove "Other Authors" row
DocTitle =
PDQCitation/CitationTitle|PubmedArticle/../ArticleTitle
output limited to 255 chars
only the text node is displayed
Documentation)
Remove the bulleted list and use an ordered list instead
DocTitle =
Body/DocumentationTitle;MetaData/Function;MetaData/DocType[1];\
Documentation[@InfoType]
DocumentationToC)
DocTitle = DocumentationToC/ToCTitle;DocumentationToC/@Use
Term)
DocTitle =
PreferredName;TermType/TermTypeName;[TermType/TermTypeName;[...]]\
[SemanticType[@cdr:ref]]
with
SemanticType = TerminologyLink to /Term/PreferredName
Organization)
DocTitle =
[Status/CurrentStatus;]OfficialName/Name;[ShortName/Name;]\
City;[CitySuffix;]\
Country[@cdr:ref]|PoliticalSubUnit_State[@cdr:ref]
with
CurrentStatus = displayed only if value = 'Inactive'
Country = CountryLink to /Country/CountryFullName
PoliticalSubUnit_State =
StateLink to /PoliticalSubUnit/PoliticalSubUnitFullName
Sorry, but time's up.
I'll enter my comments/suggestions for Mailer, Person and CTGovProtocol
tomorrow.
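The DocTitle formats above share a few rules: components are joined with semicolons, only the text content of an element is used, an empty optional component leaves two adjacent semicolons, and several filters limit the output to 255 characters. The Python sketch below illustrates those rules only; it is not the XSLT the filters actually use, and the function names are made up.

```python
import xml.etree.ElementTree as ET


def text_of(xml_fragment):
    """Return the concatenated text of an element, dropping inline markup."""
    return "".join(ET.fromstring(xml_fragment).itertext())


def doc_title(parts, limit=None):
    """Join title components with semicolons; truncate if a limit is given."""
    title = ";".join(parts)
    return title if limit is None else title[:limit]


# Format: SummaryTitle;SummaryType;SummaryAudience
print(doc_title(["Adrenocortical Carcinoma", "Treatment", "Patient"]))

# Inline markup is dropped, as in the E = m c2 example.
print(text_of("<SummaryTitle>E = m c<Superscript>2</Superscript></SummaryTitle>"))

# A missing OtherID leaves two semicolons next to each other.
print(doc_title(["PRIMARY-ID", "", "Some protocol title"], limit=255))
```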
BZDATETIME::2004-03-15 16:20:37
BZCOMMENTOR::Volker Englisch
BZCOMMENT::25
Specific Comments (continued):
==============================
CTGovProtocol)
DocTitle =
OrgStudyID[; SecondaryID[;
SecondaryID[;...]]];OfficialTitle|BriefTitle
Output limited to 255 chars
Mailer)
DocTitle =
No recipient|Organization/../OfficialName/Name|\
Person/../SurName, Person/../GivenName [SendDate] [Mailer/Type]
Display "No recipient" if the cdr:ref of the Recipient points to
a
non-existing document or
Display the Organization Name or
Display the Person Name if the Organization Name can not be
found
Person)
This one is indeed a little tricky: it is hard to show on a single line
what is actually happening in the background.
DocTitle = [Inactive;]Person/../SurName, GivenName;City;\
PoliticalSubUnit_State|Country
City and State/Country are selected from that office location
that's
matching the CIPSContact element which is either
Home|PrivatePracticeLocation|OtherPracticeLocation/SpecificPostalAddress|\
OtherPracticeLocation/Organization/OrganizationLocations/\
OrganizationLocation/Location
The status flag (i.e. Inactive) is only displayed for inactive
documents.
BZDATETIME::2004-03-15 17:18:26
BZCOMMENTOR::Volker Englisch
BZCOMMENT::26
Comment to "Denorm Filter (1/1): Terminology"
Please see the following corrected Output Structure (modifications
are marked
with "***").
Term
DocId [string] [1]
CdrDocCtl
DocValStatus [string]
DocTitle [string]
DocComment [string]
ReadyForReview [string]
Create
Date [date]
User [string]
Modify
Date [date]
User [string]
PreferredName [string] [1]
OtherName [0:many]
OtherTermName [string] [1]
OtherNameType [string] [1:many]
SourceInformation [string] [0:1]
ReviewStatus [string] [0:1]
Comment [string] [0:1]
Definition [0:many]
DefinitionText [string] [0:many]
DefinitionType [string] [1]
DefinitionSource [0:1]
DefinitionSourceName [string] [1]
DefinedTermId [string] [0:1]
Comment [string] [0:1]
TermType [1:many]
TermTypeName [string] [0:1]
SemanticType [0:many]
Term [1]
@cdr:ref [string - CDR[0-9]{10}] [1]
@PdqKey [string] [0:1]
PreferredName
SemanticTypeText [string] [0:many]
TermRelationship [1:many]
ParentTerm [1] ***
TermId [1] ***
Term [1] ***
PreferredName [1] ***
PdqKey [0:1] ***
RelatedTerm [1] ***
TermId [1] ***
Term [1] ***
@cdr:ref [string - CDR[0-9]{10}] [1] ***
@PdqKey [string] [0:1] ***
PreferredName [1] ***
PdqKey [0:1] ***
RelationshipType [string] [1] ***
Comment [string] [0:1] ***
TermStatus [string] [1] ***
Comment [string] [0:1]
BZDATETIME::2004-03-18 12:11:57
BZCOMMENTOR::Nanci Gottlieb
BZCOMMENT::27
I've added the "Filter Information" table for most of the filters.
I'm now
making sure they are all in the "System Information" list at
http://bach.nci.nih.gov/cgi-bin/cdr/Help.py?flavor=System.
I will then go back
and add the "Output Structure" and "Processing Description" for each
one.
FEAR THE TURTLE!
BZDATETIME::2004-03-25 09:58:36
BZCOMMENTOR::Nanci Gottlieb
BZCOMMENT::28
I've made the changes to the documentation that Volker suggested in
comments
#24, #25, and #26 (DocTitle filter documentation and the Term Denorm
filter
documentation). I will go through and make the same changes to the rest
of the
documentation. Volker, have you had time to think about how to best
display
all the many "choices" in the Output Structure? Take a look at the
Citation
Denormalization schema for lots of examples of choices.
BZDATETIME::2004-04-01 12:00:49
BZCOMMENTOR::Volker Englisch
BZCOMMENT::29
Filter: Denormalization Filter (1/1): Political SubUnit
The only thing I would have to mention here is that the Country element
is
listed as having the single cdr:ref attribute but in fact the
denormalization
filter carries over every attribute that exists.
However, since the PdqKey is deprecated the output structure is correct
for the time being.
What is the meaning for the entry "Filters that require it"?
You have listed 3 filters that require (?) the PoliticalSubUnit
denormalization filter. Do you mean that these depend on the output from it?
If so I would think that this is redundant since you've already listed
the
filter sets that are using this filter. I would think the "Filters
that
require it" would make more sense for global modules since these are
not
directly listed in a filter set.
Filter: Denormalization Filter: Summary
Your note says
> Also waiting to hear from Volker about how to display all the many
choices.
Could you please remind me of what you are waiting for since I
don't
remember?
Filter: Module: Citation Denormalization
Your description is a little misleading. The description says:
Denormalizes Citations beyond what the Citation Denormalization filter
does
but in fact the Citation Denormalization filter does not do any
denormalization. All it does is to call the denormalization module which
is
doing all of the denormalization.
This was done to allow other filters (protocol and summary) to use
the
same citation denormalization.
I don't remember what we decided to do: Move the entire data
structure from
the Citation denormalization filter to the denorm filter and refer to it
or
display (and possibly repeat) the structure in the denorm filter as you
have
done it.
Either way, all that the denormalization filter does is add the DocID
and the cdrctrl information; it then hands off everything else to the
citation denorm module.
BZDATETIME::2004-04-05 08:01:39
BZCOMMENTOR::Nanci Gottlieb
BZCOMMENT::30
Volker,
To answer your questions plus I have added a few more questions of my own:
Filter: Denormalization Filter (1/1): Political SubUnit
I added the PdqKey attribute. I’ve done the output structure based on
the
schema and an example. However, for “Country” it just says it is
type “countrylink”. In the schema it just has the two attributes
for
countrylink. There is no mention of the elements “CountryFullName”,
“Continent” or “PostalCodePosition”. They don’t appear in the document
when the pass through filter is run. There is a line in the filter to
get the “CountryFullName” but no mention of the “Continent”
or “PostalCodePosition”. How does this all work?
What do you mean by “the PdqKey is deprecated”?
I’ve put “Filters that require it” in there to list all the filters
that
require it to be run first. I thought this would be helpful to see
which
filters might be affected if a change is made. I do add an “Includes:”
section
if the filter includes another filter.
Filter: Denormalization Filter: Summary
I gave an example of my question about how best to display my “choices”
dilemma in comment #23. However, when you look at this, look at the
Citation schema because it has a lot more choices.
Filter: Module: Citation Denormalization
I write my descriptions based on the titles of the filters. I have no
idea
what any of the filters do. I had no idea why there is a Citation
denormalization filter AND a denormalization module. Do you have a
suggestion
for a better description for this?
BZDATETIME::2004-04-05 16:13:40
BZCOMMENTOR::Volker Englisch
BZCOMMENT::31
> Filter: Denormalization Filter (1/1): Political SubUnit
> However, for “Country” it just says it is type “countrylink”. In
the schema
> it just has the two attributes for countrylink.
You can find out what the CountryLink resolves to by looking at
the
CDRCommonSchema and the CountrySchema.
> They don’t appear in the document when the pass through filter is run.
That is correct because the passthrough filter only displays the
document as is. No denormalization is done here.
The denormalization filters do almost the same thing (displaying the
document content as is), with one major exception: whenever a
SomethingLINK element is encountered, the document the link points to is
pulled in at that location.
You encounter an OrganizationLink element--> include the Organization
document
here; you encounter a PersonLink element --> include the Person
document
here; ...
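The idea can be sketched in a few lines of Python. This is a simplified illustration and not how the XSLT filters are implemented: the in-memory document store, the sample documents, and the plain "ref" attribute (standing in for cdr:ref) are all assumptions, and only one level of links is resolved.

```python
import copy
import xml.etree.ElementTree as ET

# Hypothetical document store keyed by CDR ID; contents are invented.
DOCS = {
    "CDR0000000001": (
        "<Country>"
        "<CountryFullName>Exampleland</CountryFullName>"
        "<Continent>Europe</Continent>"
        "</Country>"
    ),
}

SOURCE = (
    "<PoliticalSubUnit>"
    "<PoliticalSubUnitFullName>Example State</PoliticalSubUnitFullName>"
    '<Country ref="CDR0000000001">Exampleland</Country>'
    "</PoliticalSubUnit>"
)


def denormalize(root, docs):
    """Return a copy of root with linked documents pulled in at each link."""
    result = copy.deepcopy(root)
    # Snapshot the elements first so newly added children are not revisited.
    for elem in list(result.iter()):
        ref = elem.get("ref")
        if ref in docs:
            linked = ET.fromstring(docs[ref])
            elem.text = None
            elem.extend(list(linked))
    return result


root = ET.fromstring(SOURCE)
print(ET.tostring(denormalize(root, DOCS), encoding="unicode"))
```

A passthrough filter corresponds to returning the source unchanged, which is why elements such as Continent only show up after the denormalization step.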
> What do you mean by “the PdqKey is deprecated”?
It is not used anymore to identify a document. The PDQKey used to be
the
unique identifier of a document in the old system. This is now replaced
by the
CDRID. Only legacy documents are still displaying this attribute.
Those
documents created within the CDR (that did not get converted from the
PDQ
system) won't display this attribute.
> - Filter: Denormalization Filter: Summary
> I gave an example of my question about how to best display my
“choices”
I don't have a problem with the first format you're offering. As long
as the
indentation is correct there shouldn't be a problem with the
format.
I'm not able to follow the second display format especially when you
assume
that the content of the first and second format is identical.
I'd go with the first option.
> - Filter: Module: Citation Denormalization
> I write my descriptions based on the titles of the filters. I have
no idea
> what any of the filters do.
I thought that you'd know after you've documented the filters.
:-)
As always, please feel free to ask me if you're not sure what the
functionality of a filter is.
> Do you have a suggestion for a better description for this?
I have no problem listing both filters as
"Denormalizes Citations"
or maybe for one (Module that Denormalizes Citations) but adding
"beyond what the Citation Denormalization filter does"
implies that there is more denormalization happening in the denorm
filter which
is not correct.
BZDATETIME::2004-04-08 12:11:24
BZCOMMENTOR::Nanci Gottlieb
BZCOMMENT::32
Hi Guys,
I'm taking a break from the filter documentation and am in the
process of
QCing the new NCI web site. I will be back documenting filters during
the
lulls in the QC process. I will miss today's meeting. Have a good
one!
Nanci
BZDATETIME::2004-04-15 11:11:51
BZCOMMENTOR::Nanci Gottlieb
BZCOMMENT::33
We are having a lull in the new NCI Website QC process so I am
working on the
Citation Denormalization Filter Documentation. The Output structure
is
incredibly ugly. You can take a look at
http://bach.nci.nih.gov/cgi-bin/cdr/Filter.py?DocId=CDR0000355386&Filter=name:Documentation+Help+Screens+Filter
if you'd like to see it. I'm doing it mostly by the schema because there are so
many
choices I can't find any examples. There are some types that are so long
and
used so many times and some that are called from inside themselves that
I have
separated them from the main output structure and put them at the end
and have
pointed to them with asterisks (from * to 8* so far). I will continue on
until
there is more to QC. See ya later ... maybe.
BZDATETIME::2004-04-23 12:26:08
BZCOMMENTOR::Volker Englisch
BZCOMMENT::34
Regarding comment #19:
I have created a directory structure that allows us to create links in
the
documentation documents to display the QC Report Output as samples for
the
output structure.
Within the document use the ExternalRef element (within a Para
element) and set
the attribute to
http://bach.nci.nih.gov/cdr/Documentation/QC_Output_Format/CDR12345.html
with 12345 being the documentation document ID.
The QC Report Output itself will have to be saved on BACH under
d:\InetPub\wwwroot\cdr\Documentation\QC_Output_Format\CDR12345.html
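The naming convention above can be sketched in a few lines of Python. This is
only an illustration, not code from the CDR system: the helper names are
hypothetical, the host is assumed to be bach.nci.nih.gov (as in the other URLs
in this thread), and the file name is assumed to use the document ID without
zero padding, as in the CDR12345.html example.

```python
import re

# Assumed constants, copied from the convention described in this comment.
BASE_URL = "http://bach.nci.nih.gov/cdr/Documentation/QC_Output_Format"
BASE_DIR = r"d:\InetPub\wwwroot\cdr\Documentation\QC_Output_Format"


def qc_sample_name(doc_id):
    """Build the sample file name for a documentation document ID.

    Accepts an integer (12345) or a CDR ID string ("CDR0000012345").
    Assumption: leading zeros are dropped, matching "CDR12345.html".
    """
    match = re.fullmatch(r"(?:CDR)?0*(\d+)", str(doc_id).strip())
    if not match:
        raise ValueError(f"not a CDR document ID: {doc_id!r}")
    return f"CDR{int(match.group(1))}.html"


def qc_sample_url(doc_id):
    """URL to put in the ExternalRef attribute (hypothetical helper)."""
    return f"{BASE_URL}/{qc_sample_name(doc_id)}"


def qc_sample_path(doc_id):
    """Location on BACH where the QC Report Output is saved (hypothetical helper)."""
    return BASE_DIR + "\\" + qc_sample_name(doc_id)
```

Deriving both the link and the file location from the same helper keeps the
two in step, so a saved sample is always reachable from its documentation
document.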
BZDATETIME::2004-05-12 10:56:25
BZCOMMENTOR::Cheryl Burg
BZCOMMENT::35
Links to sample output have been added under Output Structure for all
QC Report
Filters and Include Filters which generate sections of the
InScopeProtocol QC
Reports. In addition sample output has been added for Documentation
Help
Screens and the Table Formatter Include filter.
BZDATETIME::2004-06-03 11:43:23
BZCOMMENTOR::Nanci Gottlieb
BZCOMMENT::36
I'm back! I've noticed that there are some new filters since I last
worked on
this so I will begin with them. I finished the documentation for
"DocTitle for
PDQBoardMemberInfo" and have started working on documenting the
denormalization filter for PDQBoardMemberInfo.
BZDATETIME::2004-06-17 11:57:41
BZCOMMENTOR::Nanci Gottlieb
BZCOMMENT::37
No progress to report. Cheryl and I are working on a special project
for Sue
(for Gisele) so we will be unable to attend today's meeting.
BZDATETIME::2004-07-01 08:20:58
BZCOMMENTOR::Nanci Gottlieb
BZCOMMENT::38
Status of Filter Documentation
I pretty much have them all started. I have a documentation document for
each filter. They can be viewed at
http://bach.nci.nih.gov/cgi-bin/cdr/Help.py?flavor=System.
The ones with asterisks in front have not been done except
for “Copy XML for Organization”, CDR0000315594 (Documentation -
CDR0000360876)
and “Revision Markup Filter”, CDR0000000093 (Documentation -
CDR0000362050).
I tried to go in and add them to the “System Information” document
(CDR0000256207) but Volker has had it checked out the last couple of
days.
The following documents still need to be documented completely:
Denormalization Filter: PDQBoardMemberInfo - CDR0000000129
Denormalization of Contact Information for Person Orgs and Protocols
Module: Emailer Common
Module: General Markup Formatter
Copy XML for Citation QC Report
Copy XML for InScopeProtocol
Copy XML for PDQBoardMemberInfo
Copy XML for Person - QC Report
Copy XML for Person QC Report
Only the first one is on Bach, which is probably why I haven’t started them yet.
All the rest have the Filter Information section completed, some have
Output
Structure section completed, some do not. Some have Processing
Description
section completed, some do not.
I’m currently working on getting the statistics project up and
running. I’m
not sure when I will be able to get back to the documentation. It might
be a
good idea for whoever wrote each filter to complete the documentation
for that
filter since they best understand what it is the filter does.
Thanks,
Nanci
BZDATETIME::2004-07-08 14:10:04
BZCOMMENTOR::Bob Kline
BZCOMMENT::39
Dropped priority for now.
BZDATETIME::2012-01-03 11:08:41
BZCOMMENTOR::Volker Englisch
BZCOMMENT::40
(In reply to comment #39)
> Dropped priority for now.
Bob, I think this issue should get closed or at least reassigned. It's been a P10 for a long, long time.
BZDATETIME::2012-01-03 11:12:47
BZCOMMENTOR::Bob Kline
BZCOMMENT::41
We've done as much as we're going to do for now.
BZDATETIME::2012-01-03 11:12:59
BZCOMMENTOR::Bob Kline
BZCOMMENT::42
Closing.