CDR Tickets

Issue Number 906
Summary Documentation for filters
Created 2003-09-23 07:32:52
Issue Type Bug
Submitted By Kline, Bob (NIH/NCI) [C]
Assigned To Gottlieb, Nanci (NIH/NCI) [E] [X]
Status Closed
Resolved 2012-01-03 11:12:59
Resolution Fixed
Path /home/bkline/backups/jira/ocecdr/issue.105234
Description

BZISSUE::905
BZDATETIME::2003-09-23 07:32:52
BZCREATOR::Bob Kline
BZASSIGNEE::Nanci Gottlieb
BZQACONTACT::Bob Kline

As discussed in a recent project status meeting, the filters in the
CDR system need to be documented. This documentation should be
created as part of the online documentation subsystem, and should
include the purpose of each filter, where it is used, and (for those
which produce XML rather than HTML, and for which this is not already
documented elsewhere), the structure of the output documents.

Comment entered 2003-10-09 11:01:21 by Gottlieb, Nanci (NIH/NCI) [E] [X]

BZDATETIME::2003-10-09 11:01:21
BZCOMMENTOR::Nanci Gottlieb
BZCOMMENT::1

Please always remember this is a "Mammoth" project! I have done 15 DocTitle
filters. If you would like to take a look at them just go into XMetal on Bach
and search for Documentation documents that start with "Filter:". All comments
and suggestions are welcome.

Comment entered 2003-10-23 10:32:43 by Gottlieb, Nanci (NIH/NCI) [E] [X]

BZDATETIME::2003-10-23 10:32:43
BZCOMMENTOR::Nanci Gottlieb
BZCOMMENT::2

Completed CDR140-Passthrough Filter and CDR49-Protocol DocTitle Filter.
Finishing up on CDR135-Vendor Filter: Country. This one has a lot of code for
stuff that is optional in the DTD and is never used so it is difficult to
figure out what it is supposed to do (mostly the date part).

Comment entered 2003-10-29 15:30:00 by Gottlieb, Nanci (NIH/NCI) [E] [X]

BZDATETIME::2003-10-29 15:30:00
BZCOMMENTOR::Nanci Gottlieb
BZCOMMENT::3

Started documenting filter 134 -Vendor Filter: Term but it requires filter 101
which is the Term denormalization filter. I had questions for Volker which
made him realize that this was done the old way and needs to be rewritten the
new way. So now I am working on filter 136 -Vendor Filter: Political SubUnit
which also requires a denormalization filter (103). I'm not sure how to
document the output of the denormalization filters and Volker suggested we
might want to talk to Bob and Alan before I continue.

Comment entered 2003-11-05 14:11:27 by Gottlieb, Nanci (NIH/NCI) [E] [X]

BZDATETIME::2003-11-05 14:11:27
BZCOMMENTOR::Nanci Gottlieb
BZCOMMENT::4

Documented filter CDR103 - Denormalization Filter (1/1): Political SubUnit. I
used the schema to show the output. Could someone please go to XMetal on Bach
and take a look at it (CDR344292) and let me know if it is okay.

I then went to do Filter CDR136 - Vendor Filter: PoliticalSubUnit (which needs
CDR103 to be run first). I could only find one example with a
<PoliticalSubUnitAlternateName>. It is CDR256193. Unfortunately the alternate
name seems to disappear when the denormalization filter is run so it never
makes it to the CDR136 output. Either it needs to be added to CDR103 (denorm)
or removed from CDR136 (Vendor Filter: PoliticalSubUnit). I don't know which
one needs to be fixed.

Comment entered 2003-11-17 13:52:52 by Gottlieb, Nanci (NIH/NCI) [E] [X]

BZDATETIME::2003-11-17 13:52:52
BZCOMMENTOR::Nanci Gottlieb
BZCOMMENT::5

I am currently working on CDR137 – Vendor Filter: GlossaryTerm. It seems to
set a variable to equal "document('cdr:/*/CdrCtl')". I am told this refers to
a control table that stores info about all the documents. Does anyone know
where I can learn more about this control table as it seems to be referred to
in several filters?

Thanks

Comment entered 2003-11-18 14:15:02 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2003-11-18 14:15:02
BZCOMMENTOR::Volker Englisch
BZCOMMENT::6

Regarding Comment #4 (CDR103):
I am wondering what the definition for "Used by" and "Filters that require it"
would be. On a systems level these two items are probably the same but I'm not
sure if you wanted to indicate what person or functional group would be using
this filter.

The "Filters that require it" section is incomplete. The PoliticalSubUnit
Denormalization filter (CDR103) is used by two filter sets:

  • QC PoliticalSubUnit Set
    CDR0000000103:Denormalization Filter (1/1): Political SubUnit
    CDR0000000118:Political SubUnit QC Report

  • Vendor PoliticalSubUnit Set
    CDR0000000103:Denormalization Filter (1/1): Political SubUnit
    CDR0000000136:Vendor Filter: PoliticalSubUnit
    CDR0000315573:Vendor Filter: Final
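
As a rough illustration of how the "Filters that require it" information could be derived from the filter sets listed above rather than maintained by hand, here is a minimal Python sketch. The set names and filter IDs are copied from this comment; the dictionary layout and the helper names `sets_using` and `filters_after` are assumptions made for the example and are not part of the CDR.

```python
# Filter sets as ordered lists of (CDR ID, title), copied from the comment above.
# The data structure itself is an assumption made for this sketch.
FILTER_SETS = {
    "QC PoliticalSubUnit Set": [
        ("CDR0000000103", "Denormalization Filter (1/1): Political SubUnit"),
        ("CDR0000000118", "Political SubUnit QC Report"),
    ],
    "Vendor PoliticalSubUnit Set": [
        ("CDR0000000103", "Denormalization Filter (1/1): Political SubUnit"),
        ("CDR0000000136", "Vendor Filter: PoliticalSubUnit"),
        ("CDR0000315573", "Vendor Filter: Final"),
    ],
}


def sets_using(filter_id, filter_sets=FILTER_SETS):
    """Return the names of the filter sets that contain filter_id."""
    return sorted(
        name
        for name, members in filter_sets.items()
        if any(fid == filter_id for fid, _ in members)
    )


def filters_after(filter_id, filter_sets=FILTER_SETS):
    """Return the IDs of filters that run after filter_id in any set."""
    later = set()
    for members in filter_sets.values():
        ids = [fid for fid, _ in members]
        if filter_id in ids:
            later.update(ids[ids.index(filter_id) + 1:])
    return sorted(later)
```

Calling `sets_using("CDR0000000103")` returns both set names, and `filters_after("CDR0000000103")` lists the filters that consume the denormalized output.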

The section of "What it outputs" is not correct. You have listed the
PoliticalSubUnit schema that is used for the validation of the input of the
denormalization filter.
For instance, the output of the denormalization filter contains the CdrDocCtl as
well as the Continent elements. However, these elements are not specified in
the schema listed.

I believe by using Bob's approach of describing the filter output we would have
a compact notation and a good compromise between writing a DTD or schema for the
intermediate steps (the ultimate verification) and having to review the entire
source code of the filter creating the output.

Comment entered 2003-11-18 15:06:14 by Gottlieb, Nanci (NIH/NCI) [E] [X]

BZDATETIME::2003-11-18 15:06:14
BZCOMMENTOR::Nanci Gottlieb
BZCOMMENT::7

Regarding Comment #6 (CDR103)

First of all, please realize that I started in on this CDR project long after
it began and NOBODY ever explained anything about it to me. I just started
going to the meetings so I am kind of winging it.

Please refer to comment #1 when I asked others to look at the first 15 filters
I had completed. I said all comments and suggestions are welcome (back on
October 9). Nobody gave me any comments or suggestions until November 6 when
Bob handed out printouts of a better way and I have been doing it his way on
the filters I have completed since then. I haven’t gone back to do the ones
already completed before then (but I will).

In your first paragraph you ask about “used by” and “filters that require it”.
When I say “Used by:” I mean who or what uses this filter. In the case of the
DocTitles I documented first, the filters were used by XMetal to create the
Document titles when you save the new document. In the ones I’ve done lately
you have told me they are used during the publishing job. When I say “Filters
that require it:” I mean that certain filters require other filters to be run
first. If we don’t need the “used by” I can delete it.

The “Filters that require it" section is incomplete because I haven’t gotten
to all the other filters yet. I was just going to go back and add them as I
completed them.

In the “What it outputs” section I need to redo it (on CDR103) and I will try
to understand what you wrote before I do it. But as you know, I have no
problems asking you (Volker) to explain things to me – you do a great job.

Today I completed filters CDR137 and CDR315573 (they are documented in
CDR346470 and CDR346472). I’ve used Bob’s approach. Please take a look and see
if this is better.

If I use the wrong terminology please let me know because I can change
anything. And if you tell me why it should be changed that helps me to learn
and that is a good thing.

I am still waiting for someone to tell me where I can learn more about the
control table I asked about in comment #5.

I hope this all makes sense. This is my longest comment yet.

Comment entered 2003-11-18 15:47:28 by alan

BZDATETIME::2003-11-18 15:47:28
BZCOMMENTOR::Alan Meyer
BZCOMMENT::8

I have done some research in the CdrServer code to answer the question in
comment 5. As I understand it, "document('cdr:/*/CdrCtl')" is an XSLT extension
function call that maps to a function inside the server that produces an XML
document (I presume it appears as a node set to the filter that invokes the
function) that contains control information for the document.

"document()" is standard XSLT for retrieving another document. But how it
actually works depends on what logic a programmer (Mike Rubenstein in our case)
supplies for finding documents. For some systems it is a directory lookup in a
file system. For us it's a database lookup. I haven't tried to track down all
the logic, but I presume that the "*" just indicates the current document, so
"document('cdr:/*/CdrCtl')" means the control information for the current document.

The control information may include the following hierarchical fields:

DocCtl
  DocValDate
  DocTitle
  DocComment
  ReadyForReview
  Create
    Date
    User
  Modify
    Date
    User
  FirstPub
    Date

What it actually produces will depend in part on what it finds in the record,
and in part on the parameters used by the filter module. I haven't tried to
track those down since it may be more information than we need for this purpose.
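
To make the idea in this comment concrete, here is a small Python sketch of a resolver for URIs of the form 'cdr:/*/CdrCtl'. It is not the CdrServer code: the function name `resolve_cdr_uri`, the `FAKE_DB` dictionary standing in for the database lookup, and the sample field values are all invented for illustration. It also takes for granted the presumption stated above that "*" means the current document.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the database lookup; the values are made up.
FAKE_DB = {
    "CDR0000000137": {
        "DocTitle": "Vendor Filter: GlossaryTerm",
        "Create": {"Date": "2003-01-01", "User": "example_user"},
    },
}


def build(parent, fields):
    """Append one child element per field, recursing into nested dicts."""
    for name, value in fields.items():
        child = ET.SubElement(parent, name)
        if isinstance(value, dict):
            build(child, value)
        else:
            child.text = str(value)


def resolve_cdr_uri(uri, current_doc_id, db=FAKE_DB):
    """Map 'cdr:/<doc id or *>/CdrCtl' to a control-information element."""
    scheme, _, rest = uri.partition(":/")
    doc_part, _, what = rest.partition("/")
    if scheme != "cdr" or what != "CdrCtl":
        raise ValueError("unsupported URI: %r" % uri)
    # Assumption from the comment above: "*" stands for the current document.
    doc_id = current_doc_id if doc_part == "*" else doc_part
    root = ET.Element("DocCtl")
    build(root, db[doc_id])
    return root
```

A filter asking for `resolve_cdr_uri("cdr:/*/CdrCtl", "CDR0000000137")` would then get back a small element tree whose children follow the field hierarchy listed above.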

Comment entered 2003-12-03 15:29:33 by Gottlieb, Nanci (NIH/NCI) [E] [X]

BZDATETIME::2003-12-03 15:29:33
BZCOMMENTOR::Nanci Gottlieb
BZCOMMENT::9

I've kind of taken a break on this until I hear back about the format I have
been using. I know Volker had raised some questions and at the last meeting
Bob said he would take a look at them. I'd like to make sure that I am doing
it the way you all want it done before I continue on.

Thanks,
Nanci

Comment entered 2003-12-03 16:20:18 by Kline, Bob (NIH/NCI) [C]

BZDATETIME::2003-12-03 16:20:18
BZCOMMENTOR::Bob Kline
BZCOMMENT::10

I'll take a look at this in the morning before the status meeting.

Alan and Volker: it would probably be a good idea for both of you
to provide input on the format here, since you'll both be consumers
of the documentation.

Comment entered 2003-12-04 10:27:10 by Kline, Bob (NIH/NCI) [C]

BZDATETIME::2003-12-04 10:27:10
BZCOMMENTOR::Bob Kline
BZCOMMENT::11

The format of the output structure looks fine.

I have modified the documentation for the vendor final filter.
The structure has been altered somewhat, and I fleshed out the
description of what the filter does. Depending on how many
different filters pull in an included filter, it may be appropriate
in some cases to describe what the included filter does separately
(referring to the separate documentation in the docs for the including
filter), or embed that description of the logic in the docs for
the including filter itself (this will be most appropriate when
the portions of the included filter which get applied depend on the
interaction with the including filter). For purposes of this example,
I have documented everything the filter does (including the imported
filters) in one place.

Visit
http://bach.nci.nih.gov/cgi-bin/cdr/Filter.py?DocId=CDR0000346472&Filter=name:Documentation+Help+Screens+Filter
to see the results of my modifications.

Comment entered 2003-12-04 10:31:33 by Kline, Bob (NIH/NCI) [C]

BZDATETIME::2003-12-04 10:31:33
BZCOMMENTOR::Bob Kline
BZCOMMENT::12

By the way, be careful not to mark as [string] elements which are
containers for other elements, and which do not themselves have any
direct text nodes. (The two elements for the term definition
should have "[string]" removed.)

Comment entered 2003-12-04 10:37:10 by Kline, Bob (NIH/NCI) [C]

BZDATETIME::2003-12-04 10:37:10
BZCOMMENTOR::Bob Kline
BZCOMMENT::13

One more comment:

Because the vendor final filter is used by all of the document
types, it is less appropriate to include the output structure
block (it will be different for each of the document types).
In this case, the DTD itself would be the place to go for the
output structure. You would want to have the structure for the
vendor output for specific document types included in the docs
for the filters which precede the general-purpose vendor final
filter. Let's replace the GlossaryTerm elements in the Output
Structure section with a line that says "This is the final filter
for vendor output for all document types; for the output structure
details, please refer to the vendor DTD (pdq.dtd)."

Comment entered 2003-12-17 15:23:57 by Gottlieb, Nanci (NIH/NCI) [E] [X]

BZDATETIME::2003-12-17 15:23:57
BZCOMMENTOR::Nanci Gottlieb
BZCOMMENT::14

I've gone back and redone the documentation for all 15 DocTitle filters in the
new format. I've also updated the Passthrough Filter, Vendor Filter: Country,
and the Denorm Filter (1/1): Terminology. I still have the Denorm Filter
(1/1): Political SubUnit, Vendor Filter: PoliticalSubUnit, Vendor Filter:
Glossary Term, and Vendor Filter: Final to redo before I move on to new ones.
I am also learning all about DTDs and Schemas.

Comment entered 2004-01-08 10:36:43 by Gottlieb, Nanci (NIH/NCI) [E] [X]

BZDATETIME::2004-01-08 10:36:43
BZCOMMENTOR::Nanci Gottlieb
BZCOMMENT::15

I have finished updating all the filters that I have done previously to the new
format. I am beginning to work on the "include" filters next.

Comment entered 2004-01-22 09:00:57 by Gottlieb, Nanci (NIH/NCI) [E] [X]

BZDATETIME::2004-01-22 09:00:57
BZCOMMENTOR::Nanci Gottlieb
BZCOMMENT::16

Update: I'm almost finished with the "include" filters. I'm actually just
putting in the basic information and then will go back and fill out the Output
Structure part later. Cheryl is helping out by doing the InScopeProtocol
filters.

Comment entered 2004-01-29 08:33:34 by Gottlieb, Nanci (NIH/NCI) [E] [X]

BZDATETIME::2004-01-29 08:33:34
BZCOMMENTOR::Nanci Gottlieb
BZCOMMENT::17

Finished the include filters except for the Output Structure part. I'm not
quite sure how that part should be done. What has been completed so far can be
viewed at http://bach.nci.nih.gov/cgi-bin/cdr/Help.py?flavor=System. They're
down at the bottom of the page. The ones that don't have asterisks before
them are done/started.

Comment entered 2004-01-29 08:44:27 by Kline, Bob (NIH/NCI) [C]

BZDATETIME::2004-01-29 08:44:27
BZCOMMENTOR::Bob Kline
BZCOMMENT::18

Volker:

Could you review what Nanci has so far and provide her with some
guidance for the output structure portion of the docs for include
filters? Thanks.

Comment entered 2004-01-29 09:59:08 by Burg, Cheryl (NIH/NCI) [E] [X]

BZDATETIME::2004-01-29 09:59:08
BZCOMMENTOR::Cheryl Burg
BZCOMMENT::19

The InScopeProtocol Include filters used in the QC Reports, and the QC Report
filters have been completed. The Output structure sections for the QC filters
will be completed when a method for linking to sample QC report output is
determined. I noted that the HP Summary Reports were not listed in the Index
under QC Filters. I will discuss with Volker before beginning documentation.

Comment entered 2004-01-29 15:37:18 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2004-01-29 15:37:18
BZCOMMENTOR::Volker Englisch
BZCOMMENT::20

I've added an entry for the Redline/Strikeout and Bold/Underline QC reports to
the Systems TOC.

Comment entered 2004-02-04 14:51:43 by Burg, Cheryl (NIH/NCI) [E] [X]

BZDATETIME::2004-02-04 14:51:43
BZCOMMENTOR::Cheryl Burg
BZCOMMENT::21

Completed documentation of Global Change filters now in production.

Comment entered 2004-02-12 08:05:38 by Gottlieb, Nanci (NIH/NCI) [E] [X]

BZDATETIME::2004-02-12 08:05:38
BZCOMMENTOR::Nanci Gottlieb
BZCOMMENT::22

I've been working on the denormalization filter documentation. I use the
schemas to help with the Output Structure part. Starting with the Citation
schema and now the Summary schema there have been a lot of new schema features
(like groups and choices) that I am not quite familiar with. I have ordered a
book all about XML Schemas. It should be here early next week.

Comment entered 2004-02-24 11:22:39 by Gottlieb, Nanci (NIH/NCI) [E] [X]

BZDATETIME::2004-02-24 11:22:39
BZCOMMENTOR::Nanci Gottlieb
BZCOMMENT::23

I'm working on the Output Structure part of the Citation Denormalization. There
are several "Choice"s in the schema for this and I am wondering what the best
way to display this is.

An example is:
<group name="PubDateGroup">
  <choice>
    <sequence>
      <element name="Year" type="integer" />
      <choice minOccurs="0">
        <sequence>
          <element name="Month" type="NotEmptyString" />
          <element name="Day" type="NotEmptyString" minOccurs="0" />
        </sequence>
        <element name="Season" type="NotEmptyString" />
      </choice>
    </sequence>
    <element name="MedlineDate" type="NotEmptyString" />
  </choice>
</group>

I believe the possible results are:

Year
or
Year Month
or
Year Month Day
or
Year Season
or
MedlineDate
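
One way to double-check a list like this is to expand the content model mechanically. The Python sketch below encodes the PubDateGroup example with three hypothetical helpers (`elem`, `seq`, `choice`) and enumerates every element sequence the model allows. It only handles minOccurs of 0 or 1 and ignores repetition, so it is a checking aid under those assumptions, not a schema validator.

```python
from itertools import product


def elem(name, optional=False):
    return ("elem", name, optional)


def seq(*parts, optional=False):
    return ("seq", parts, optional)


def choice(*parts, optional=False):
    return ("choice", parts, optional)


def expand(node):
    """Return every tuple of element names that the node allows."""
    kind, body, optional = node
    if kind == "elem":
        results = [(body,)]
    elif kind == "seq":
        # Concatenate one alternative from each part, in order.
        results = [sum(combo, ()) for combo in product(*(expand(p) for p in body))]
    else:  # choice: any one of the parts
        results = [r for p in body for r in expand(p)]
    if optional:  # minOccurs="0" adds the empty alternative
        results = [()] + results
    return results


PUB_DATE_GROUP = choice(
    seq(
        elem("Year"),
        choice(
            seq(elem("Month"), elem("Day", optional=True)),
            elem("Season"),
            optional=True,
        ),
    ),
    elem("MedlineDate"),
)

for names in expand(PUB_DATE_GROUP):
    print(" ".join(names))
```

For this group the expansion yields exactly five alternatives, which agrees with the list above.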

The way I would normally display this is:

PubDateGroup
  choice
    Year [integer][1]
    choice [0:1]
      Month [NotEmptyString][1]
      Day [NotEmptyString][0:1]
      or
      Season [NotEmptyString][1]
    or
    MedlineDate [NotEmptyString][1]

But it is hard to tell where one choice begins and another ends and since
there is a choice between a sequence and a single element it becomes even more
confusing.

Another way to display it could be:

PubDateGroup
  choice 1
    sequence 1
      Year [integer][1]
      choice 2 [0:1]
        sequence 2
          Month [NotEmptyString][1]
          Day [NotEmptyString][0:1]
        /sequence 2
        or
        Season [NotEmptyString][1]
      /choice 2
    /sequence 1
    or
    MedlineDate [NotEmptyString][1]
  /choice 1

But then this is just one small group in an output full of choices that
contain types that have even more choices so it is becoming very difficult to
see where one choice ends and another one begins.

Does anyone have any suggestions on displaying this in a way that is easy to
understand?

Thanks,
Nanci
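
One possible approach, offered only as a sketch, is to generate the display from the content model and let indentation alone mark where each choice or sequence begins and ends, so no closing markers are needed. The Python below does this for the PubDateGroup example. The helper names (`element`, `group`, `render`) and the dictionary layout are invented for the illustration; the bracket notation follows the [type][occurrence] style used in this comment.

```python
def element(name, type_name, min_occurs=1, max_occurs=1):
    return {"kind": "element", "name": name, "type": type_name,
            "min": min_occurs, "max": max_occurs}


def group(kind, children, min_occurs=1, max_occurs=1):
    # kind is "choice" or "sequence"
    return {"kind": kind, "children": children,
            "min": min_occurs, "max": max_occurs}


def occurs(node):
    """Format the occurrence as [1] or [0:1], matching the notation above."""
    low, high = node["min"], node["max"]
    return "[%s]" % low if low == high else "[%s:%s]" % (low, high)


def render(node, depth=0):
    """Return display lines; children sit one level deeper than their parent."""
    pad = "  " * depth
    if node["kind"] == "element":
        return ["%s%s [%s]%s" % (pad, node["name"], node["type"], occurs(node))]
    lines = ["%s%s %s" % (pad, node["kind"], occurs(node))]
    for child in node["children"]:
        lines.extend(render(child, depth + 1))
    return lines


PUB_DATE_GROUP = group("choice", [
    group("sequence", [
        element("Year", "integer"),
        group("choice", [
            group("sequence", [
                element("Month", "NotEmptyString"),
                element("Day", "NotEmptyString", 0, 1),
            ]),
            element("Season", "NotEmptyString"),
        ], min_occurs=0),
    ]),
    element("MedlineDate", "NotEmptyString"),
])

print("\n".join(render(PUB_DATE_GROUP)))
```

Because every alternative of a choice appears at the same indentation level directly under the "choice" line, the "or" separators become optional.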

Comment entered 2004-03-10 20:06:09 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2004-03-10 20:06:09
BZCOMMENTOR::Volker Englisch
BZCOMMENT::24

I spent a little time to go through some of the filters. Right now I just
tackled the DocTitle filters.
General Comments:
=================

  • I would like to see all of the filters use the same filter name such as
    "DocTitle for [DocumentType]"
    Most of the filters do use this title but some do not, such as "DocTitle for
    Glossary Term" or "Inscope Protocol DocTitle Filter"
    We will need to discuss who should be doing this change (if we decide to go
    forward with it) and which parts of the system would be affected.

  • Some of the filters have not listed a single line of comments and are not
    formatted as we all would like it.
    This should be fixed since it makes it easier to read and maintain the
    filters.
    At the same time we could eliminate code that's commented out in several
    filters and remove the bogus <text/> tag that appears here and there.
    Well, these tags are not completely bogus but it would be better to write
    ...<text>;</text>
    instead of
    ...;<text/>
    to avoid an extra space to display after the semicolon.
    Again, we would need to discuss who should be doing this change.

  • I would like to add the document ID of the filter to the Filter Title.
    This has the advantage - if you don't know the filter name but you have the
    title - that you can search for the ID in XMetaL.
    Use the notation CDR67 instead of CDR0000000067 in the title.

  • Add the document ID of the filter to the filter names of the TOC for the
    same reason as above.

  • When creating the table the Title element is mandatory. By leaving this
    element empty the display shows two extra blank lines between the section
    title "Filter Information" and the table.
    I would recommend to either drop the SectionTitle and enter the "Filter
    Information" as the Table/Title text or leave the SectionTitle and enter
    a Table/Title such as "Overview" or list the SectionTitle as "Overview" and
    the Table/Title as "Filter Information".

  • Where it exists, I would remove the row "Other Authors" since it is not very
    meaningful information or you'd extract the information from the CVS log.
    For the initial Author I would prefer to list the full name instead of first
    names only and list TBD instead of "?"
    I believe it's safe to say that the initial author for almost all of these
    filters was Cheryl.

  • Some filters list a name next to the version of the filter. I would probably
    remove this or be consistent and list it everywhere.
    In regard to the last comment you might want to replace the "Other Authors"
    with a row "Last Author" that would name the developer creating the latest
    version in CVS.

  • The country filter has the "Output Structure" as a table row. All others list
    the output structure as a section.

  • I'm not sure yet what the benefit of the "Description" row is but that's
    probably because the filter title and the description are almost identical
    for the DocTitle filters. I might want to come back to this later once I
    looked at some other filter types.
    For now I would just like the description to be listed consistently as:
    Creates DocTitle for [Doctype] documents
    or something like that and please use the document type to replace [Doctype]
    instead of repeating the typos existing in the filter title.

  • When you display an example I would display it as:
    Example
    <TT>This is the example</TT>
    instead of
    <TT>Ex: "This is the example"</TT>
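
For the two ID notations mentioned in the comments above (CDR67 in titles, CDR0000000067 as the full document ID), a small conversion helper could look like the following Python sketch. The function names `short_id` and `long_id` are hypothetical; the ten-digit padding is taken from the full IDs quoted in this ticket.

```python
import re

_CDR_ID = re.compile(r"CDR(\d+)")


def _number(cdr_id):
    """Extract the numeric part of a CDR ID in either notation."""
    match = _CDR_ID.fullmatch(cdr_id)
    if match is None:
        raise ValueError("not a CDR ID: %r" % cdr_id)
    return int(match.group(1))


def short_id(cdr_id):
    """Return the compact notation suggested for titles, e.g. CDR67."""
    return "CDR%d" % _number(cdr_id)


def long_id(cdr_id):
    """Return the full notation with the number zero-padded to ten digits."""
    return "CDR%010d" % _number(cdr_id)
```

With this, `short_id("CDR0000000067")` gives the form to put in a title, and `long_id("CDR67")` recovers the full ID for a search.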

Specific Comments:
==================
Country)

  • Initial Author is CB

  • Move Output Structure out of the table

  • I'm thinking if you want to write for these trivial filters something like
    DocTitle = /Country/CountryFullName [string]

GlossaryTerm)

  • Filter title should be "DocTitle for GlossaryTerm"

  • IA is CB

  • DocTitle = /GlossaryTerm/TermName [string]

PoliticalSubUnit)

  • IA is CB

  • DocTitle = /PoliticalSubUnit/PoliticalSubUnitFullName [string]

  • Notes: Yes, the comment is wrong.

PublishingSystem)

  • DocTitle = /PublishingSystem/SystemName [string]

  • there are a lot of blank lines in the filter that could be removed.

Summary)

  • to 1.): The schema allows one and only one SummaryTitle. Just because
    there is a for-each loop around the element doesn't mean this element is
    multiply occurring.
    Note: The text node of the title is displayed here. A title like this:
    E = m c<Superscript>2</Superscript>
    would be displayed as
    E = m c2

  • to 2.): Same as above. SummaryAudience exists exactly once per schema.
    I'm not sure if it would make more sense to display the output structure as
    Format:
    SummaryTitle;SummaryType;SummaryAudience
    Example:
    Adrenocortical Carcinoma;Treatment;Patient

instead of
Ex: Adrenocortical Carcinoma;Treatment;Patient
1 ; 2 ; 3
1. SummaryTitle
2. SummaryType
3. SummaryAudience

For me personally it's easier to follow an example once I know the rule
instead of the other way around.
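
The two points made for the Summary filter (only the text node of the title is used, and the three parts are joined with semicolons) can be illustrated with a short Python sketch. This is not the XSLT filter itself. Where the three elements sit inside a real Summary document is not stated in this ticket, so the sketch simply takes the first element of each name wherever it occurs; the helper names and the sample document are assumptions.

```python
import xml.etree.ElementTree as ET


def text_of(element):
    """Concatenate all text inside an element, dropping inline markup."""
    return "".join(element.itertext()) if element is not None else ""


def summary_doc_title(summary_xml):
    """Build SummaryTitle;SummaryType;SummaryAudience from a Summary document."""
    root = ET.fromstring(summary_xml)
    names = ("SummaryTitle", "SummaryType", "SummaryAudience")
    # Assumption: take the first element with each name, wherever it occurs.
    parts = [text_of(next(root.iter(name), None)) for name in names]
    return ";".join(parts)


SAMPLE = (
    "<Summary>"
    "<SummaryTitle>E = m c<Superscript>2</Superscript></SummaryTitle>"
    "<SummaryType>Treatment</SummaryType>"
    "<SummaryAudience>Patient</SummaryAudience>"
    "</Summary>"
)

print(summary_doc_title(SAMPLE))
```

The Superscript markup disappears and only its text survives, which is the behavior described for the title above.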

ScientificProtocolInfo)
DocTitle =
PrimaryID/IDString;OtherID/IDString[1];ProtocolTitle[@Type='Original']
output limited to 255 chars

OutOfScopeProtocol)
DocTitle =
PrimaryID/IDString;[OtherID/IDString[1]];ProtocolTitle/TitleText
output limited to 255 chars

Note: only the first OtherID element is displayed. If none exists the
output displays two semicolons next to each other.
The text node of the TitleText element is displayed.
Answer to your question:
Yes, the code commented out could be deleted.
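
The OutOfScopeProtocol rule described above (first OtherID only, an empty slot when there is none, output capped at 255 characters) can be sketched in Python as follows. The function name and the argument values in the usage note are made up for the example, and truncation by simple slicing is an assumption about how the limit is applied.

```python
def out_of_scope_doc_title(primary_id, other_ids, title_text, limit=255):
    """Join PrimaryID, the first OtherID (if any) and the title text."""
    # Only the first OtherID is used; with none, the slot stays empty,
    # which leaves two semicolons next to each other.
    first_other = other_ids[0] if other_ids else ""
    title = ";".join([primary_id, first_other, title_text])
    # Assumption: the 255-character limit is a plain truncation.
    return title[:limit]
```

For instance, `out_of_scope_doc_title("ID-1", [], "Some title")` gives "ID-1;;Some title", showing the doubled semicolon.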

Miscellaneous)
MiscellaneousDocumentTitle;MiscellaneousDocumentType[;Language]
Language is displayed for Spanish documents only

InScopeProtocol)
PrimaryID/IDString;[OtherID/IDString;[OtherID/IDString;...]]\
ProtocolTitle[@Type='Professional']
Output limited to 255 chars
The OutOfScope and InScope protocol DocTitle filters should probably be
combined or use a similar format.

Citation)

  • Remove "Other Authors" row

DocTitle = PDQCitation/CitationTitle|PubmedArticle/../ArticleTitle
output limited to 255 chars
only the text node is displayed

Documentation)

  • Remove the bulleted list and use an ordered list instead

DocTitle = Body/DocumentationTitle;MetaData/Function;MetaData/DocType[1];\
Documentation[@InfoType]

DocumentationToC)
DocTitle = DocumentationToC/ToCTitle;DocumentationToC/@Use

Term)
DocTitle = PreferredName;TermType/TermTypeName;[TermType/TermTypeName;[...]]\
[SemanticType[@cdr:ref]]
with
SemanticType = TerminologyLink to /Term/PreferredName

Organization)
DocTitle = [Status/CurrentStatus;]OfficialName/Name;[ShortName/Name;]\
City;[CitySuffix;]\
Country[@cdr:ref]|PoliticalSubUnit_State[@cdr:ref]
with
CurrentStatus = displayed only if value = 'Inactive'
Country = CountryLink to /Country/CountryFullName
PoliticalSubUnit_State =
StateLink to /PoliticalSubUnit/PoliticalSubUnitFullName

Sorry, but time's up.
I'll enter my comments/suggestions for Mailer, Person and CTGovProtocol tomorrow.

Comment entered 2004-03-15 16:20:37 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2004-03-15 16:20:37
BZCOMMENTOR::Volker Englisch
BZCOMMENT::25

Specific Comments (continued):
==============================
CTGovProtocol)
DocTitle =
OrgStudyID[; SecondaryID[; SecondaryID[;...]]];OfficialTitle|BriefTitle
Output limited to 255 chars

Mailer)
DocTitle =
No recipient|Organization/../OfficialName/Name|\
Person/../SurName, Person/../GivenName [SendDate] [Mailer/Type]

  • Display "No recipient" if the cdr:ref of the Recipient points to a
    non-existing document or
    Display the Organization Name or
    Display the Person Name if the Organization Name can not be found
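
The fallback order for the Mailer recipient described in the bullet above can be written out as a small Python sketch. The dictionary keys (`organization_name`, `surname`, `given_name`) and the use of `None` for a link to a non-existing document are assumptions made for the illustration; the real filter works on the linked CDR documents.

```python
def recipient_label(recipient_doc):
    """Pick the recipient text for a Mailer DocTitle.

    recipient_doc is None when the cdr:ref points to a non-existing
    document; otherwise it is a dict with hypothetical keys.
    """
    if recipient_doc is None:
        return "No recipient"
    organization = recipient_doc.get("organization_name")
    if organization:
        return organization
    # Fall back to the person name when no organization name is found.
    return "%s, %s" % (recipient_doc.get("surname", ""),
                       recipient_doc.get("given_name", ""))
```

The three branches correspond one to one with the three cases listed in the bullet.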

Person)
This one is indeed a little tricky to display on a single line what is
actually happening in the background.
DocTitle = [Inactive;]Person/../SurName, GivenName;City;\
PoliticalSubUnit_State|Country
City and State/Country are selected from that office location that's
matching the CIPSContact element which is either
Home|PrivatePracticeLocation|OtherPracticeLocation/SpecificPostalAddress|\
OtherPracticeLocation/Organization/OrganizationLocations/\
OrganizationLocation/Location
The status flag (i.e. Inactive) is only displayed for inactive documents.

Comment entered 2004-03-15 17:18:26 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2004-03-15 17:18:26
BZCOMMENTOR::Volker Englisch
BZCOMMENT::26

Comment to "Denorm Filter (1/1): Terminology"

Please see the following corrected Output Structure (modifications are marked
with "***").

Term
DocId [string] [1]
CdrDocCtl
DocValStatus [string]
DocTitle [string]
DocComment [string]
ReadyForReview [string]
Create
Date [date]
User [string]
Modify
Date [date]
User [string]
PreferredName [string] [1]
OtherName [0:many]
OtherTermName [string] [1]
OtherNameType [string] [1:many]
SourceInformation [string] [0:1]
ReviewStatus [string] [0:1]
Comment [string] [0:1]
Definition [0:many]
DefinitionText [string] [0:many]
DefinitionType [string] [1]
DefinitionSource [0:1]
DefinitionSourceName [string] [1]
DefinedTermId [string] [0:1]
Comment [string] [0:1]

      • TermType [1:many]
        TermTypeName [string] [0:1]

      • SemanticType [0:many]

      • Term [1]
        @cdr:ref [string - CDR[0-9]{10}] [1]
        @PdqKey [string] [0:1]
        PreferredName
        SemanticTypeText [string] [0:many]

      • TermRelationship [1:many]

      • ParentTerm [1]

        *** TermId [1]

        *** Term [1]
        @cdr:ref [string - CDR[0-9]{10}] [1]
        @PdqKey [string] [0:1]

        *** PreferredName [1]

        *** PdqKey [0:1]
        ParentType [string] [1]
        Comment [string] [0:1]

        ***

        RelatedTerm [1]

        *** TermId [1]

        *** Term [1]

        *** @cdr:ref [string - CDR[0-9]{10}] [1]

        *** @PdqKey [string] [0:1]

        *** PreferredName [1]

        *** PdqKey [0:1]

        *** RelationshipType [string] [1]

        *** Comment [string] [0:1]

        *** TermStatus [string] [1]
        MenuInformation [0:1]
        MenuItem [1:many]
        @SortOrder [string] [0:1]
        MenuType [MenuType] [1]
        MenuParent [0:many]
        Term [1]
        @cdr:ref [string - CDR[0-9]{10}] [1]
        PreferredName [string] [1]
        PdqKey [string] [0:1]
        DisplayName [string] [0:1]
        MenuStatus [string] [1]
        EnteredBy [string] [1]
        EntryDate [date] [1]
        Comment [string] [0:1]

        *** Comment [string] [0:1]
        DateLastModified [date] [0:1]
        PdqKey [string] [0:1]

Comment entered 2004-03-18 12:11:57 by Gottlieb, Nanci (NIH/NCI) [E] [X]

BZDATETIME::2004-03-18 12:11:57
BZCOMMENTOR::Nanci Gottlieb
BZCOMMENT::27

I've added the "Filter Information" table for most of the filters. I'm now
making sure they are all in the "System Information" list at
http://bach.nci.nih.gov/cgi-bin/cdr/Help.py?flavor=System. I will then go back
and add the "Output Structure" and "Processing Description" for each one.

FEAR THE TURTLE!

Comment entered 2004-03-25 09:58:36 by Gottlieb, Nanci (NIH/NCI) [E] [X]

BZDATETIME::2004-03-25 09:58:36
BZCOMMENTOR::Nanci Gottlieb
BZCOMMENT::28

I've made the changes to the documentation that Volker suggested in comments
#24, #25, and #26 (DocTitle filter documentation and the Term Denorm filter
documentation). I will go through and make the same changes to the rest of the
documentation. Volker, have you had time to think about how to best display
all the many "choices" in the Output Structure? Take a look at the Citation
Denormalization schema for lots of examples of choices.

Comment entered 2004-04-01 12:00:49 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2004-04-01 12:00:49
BZCOMMENTOR::Volker Englisch
BZCOMMENT::29

  • Filter: Denormalization Filter (1/1): Political SubUnit
    The only thing I would have to mention here is that the Country element is
    listed as having the single cdr:ref attribute but in fact the denormalization
    filter carries over every attribute that exists.
    However, since the PdqKey is deprecated the output structure is correct for the
    time being.

What is the meaning for the entry "Filters that require it"?
You have listed 3 filters that require (?) the PoliticalSubUnit denormalization
filter. Do you mean that these depend on the output from it?
If so I would think that this is redundant since you've already listed the
filter sets that are using this filter. I would think the "Filters that
require it" would make more sense for global modules since these are not
directly listed in a filter set.

  • Filter: Denormalization Filter: Summary
    Your note says
    > Also waiting to hear from Volker about how to display all the many choices.
    Could you please remind me of what you are waiting for since I don't
    remember?

  • Filter: Module: Citation Denormalization
    Your description is a little misleading. The description says:
    Denormalizes Citations beyond what the Citation Denormalization filter does
    but in fact the Citation Denormalization filter does not do any
    denormalization. All it does is to call the denormalization module which is
    doing all of the denormalization.
    This was done to allow other filters (protocol and summary) to use the
    same citation denormalization.

I don't remember what we decided to do: Move the entire data structure from
the Citation denormalization filter to the denorm filter and refer to it or
display (and possibly repeat) the structure in the denorm filter as you have
done it.

Either way, all that the denormalization filter does is add the DocID
and the cdrctrl information and then hand off everything else to the
citation denorm module.

Comment entered 2004-04-05 08:01:39 by Gottlieb, Nanci (NIH/NCI) [E] [X]

BZDATETIME::2004-04-05 08:01:39
BZCOMMENTOR::Nanci Gottlieb
BZCOMMENT::30

Volker,

To answer your questions plus I have added a few more questions of my own:

  • Filter: Denormalization Filter (1/1): Political SubUnit
    I added the PdqKey attribute. I’ve done the output structure based on the
    schema and an example. However, for “Country” it just says it is
    type “countrylink”. In the schema it just has the two attributes for
    countrylink. There is no mention of the
    elements: “CountryFullName”, “Continent” or “PostalCodePosition”. They don’t
    appear in the document when the pass through filter is run. There is a line in
    the filter to get the “CountryFullName” but no mention of the “Continent”
    or “PostalCodePosition”. How does this all work?

What do you mean by “the PdqKey is deprecated”?

I’ve put “Filters that require it” in there to list all the filters that
require it to be run first. I thought this would be helpful to see which
filters might be affected if a change is made. I do add an “Includes:” section
if the filter includes another filter.

  • Filter: Denormalization Filter: Summary
    I gave an example of my question about how to best display my “choices”
    dilemma in comment #23. However, when you look at this, look at the Citation
    schema because it has a lot more choices.

  • Filter: Module: Citation Denormalization
    I write my descriptions based on the titles of the filters. I have no idea
    what any of the filters do. I had no idea why there is a Citation
    denormalization filter AND a denormalization module. Do you have a suggestion
    for a better description for this?

Comment entered 2004-04-05 16:13:40 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2004-04-05 16:13:40
BZCOMMENTOR::Volker Englisch
BZCOMMENT::31

> Filter: Denormalization Filter (1/1): Political SubUnit
> However, for “Country” it just says it is type “countrylink”. In the schema
> it just has the two attributes for countrylink.

You can find out what the CountryLink resolves to by looking at the
CDRCommonSchema and the CountrySchema.

> They don’t appear in the document when the pass through filter is run.

That is correct because the passthrough filter only displays the document as
is. No denormalization is done here.
The denormalization filters do almost the same thing (display the document
content as is), with one major exception: whenever a SomethingLink element is
encountered, the document the link points to is pulled in at that location.
You encounter an OrganizationLink element --> include the Organization document
here; you encounter a PersonLink element --> include the Person document
here; ...
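
The difference between the two kinds of filter can be sketched roughly as
follows. This is only an illustration in Python, not the filter code itself:
the in-memory DOCS table, the "ref" attribute name and both function names are
assumptions made up for the example.

```python
import copy
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the document store, keyed by document ID.
DOCS = {
    "CDR0000000001": "<Organization><Name>Example Org</Name></Organization>",
}


def passthrough(doc):
    """Return a copy of the document as is; no links are resolved."""
    return copy.deepcopy(doc)


def denormalize(doc, docs=DOCS):
    """Copy the document and pull the linked document in under each *Link element.

    "ref" is an assumed attribute name for the link target.
    """
    result = copy.deepcopy(doc)
    # Snapshot the elements first so that the documents pulled in are not
    # themselves walked; nested links are left unresolved in this sketch.
    for elem in list(result.iter()):
        ref = elem.get("ref")
        if elem.tag.endswith("Link") and ref in docs:
            elem.append(ET.fromstring(docs[ref]))
    return result
```

Given <Person><OrganizationLink ref="CDR0000000001"/></Person>, passthrough()
returns the document unchanged, while denormalize() returns it with the
Organization document nested inside the OrganizationLink element.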

> What do you mean by “the PdqKey is deprecated”?

It is not used anymore to identify a document. The PDQKey used to be the
unique identifier of a document in the old system. This is now replaced by the
CDRID. Only legacy documents are still displaying this attribute. Those
documents created within the CDR (that did not get converted from the PDQ
system) won't display this attribute.

> - Filter: Denormalization Filter: Summary
> I gave an example of my question about how to best display my “choices”

I don't have a problem with the first format you're offering. As long as the
indentation is correct there shouldn't be a problem with the format.
I'm not able to follow the second display format especially when you assume
that the content of the first and second format is identical.
I'd go with the first option.

> - Filter: Module: Citation Denormalization
> I write my descriptions based on the titles of the filters. I have no idea
> what any of the filters do.

I thought that you'd know after you've documented the filters. :-)
As always, please feel free to ask me if you're not sure what the
functionality of a filter is.

> Do you have a suggestion for a better description for this?

I have no problem listing both filters as
"Denormalizes Citations"
or maybe for one (Module that Denormalizes Citations) but adding
"beyond what the Citation Denormalization filter does"
implies that there is more denormalization happening in the denorm filter which
is not correct.

Comment entered 2004-04-08 12:11:24 by Gottlieb, Nanci (NIH/NCI) [E] [X]

BZDATETIME::2004-04-08 12:11:24
BZCOMMENTOR::Nanci Gottlieb
BZCOMMENT::32

Hi Guys,

I'm taking a break from the filter documentation and am in the process of
QCing the new NCI web site. I will be back documenting filters during the
lulls in the QC process. I will miss today's meeting. Have a good one!

Nanci

Comment entered 2004-04-15 11:11:51 by Gottlieb, Nanci (NIH/NCI) [E] [X]

BZDATETIME::2004-04-15 11:11:51
BZCOMMENTOR::Nanci Gottlieb
BZCOMMENT::33

We are having a lull in the new NCI Website QC process so I am working on the
Citation Denormalization Filter Documentation. The Output structure is
incredibly ugly. You can take a look at
http://bach.nci.nih.gov/cgi-bin/cdr/Filter.py?DocId=CDR0000355386&Filter=name:Documentation+Help+Screens+Filter
if you'd like to see it. I'm doing it mostly by the schema because there are so many
choices I can't find any examples. There are some types that are so long and
used so many times and some that are called from inside themselves that I have
separated them from the main output structure and put them at the end and have
pointed to them with asterisks (from * to 8* so far). I will continue on until
there is more to QC. See ya later ... maybe.

Comment entered 2004-04-23 12:26:08 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2004-04-23 12:26:08
BZCOMMENTOR::Volker Englisch
BZCOMMENT::34

Regarding comment #19:
I have created a directory structure that allows us to create links in the
documentation documents to display the QC Report Output as samples for the
output structure.

Within the document use the ExternalRef element (within a Para element) and set
the attribute to
http://bach/nci.nih.gov/cdr/Documentation/QC_Output_Format/CDR12345.html
with 12345 being the documentation document ID.

The QC Report Output itself will have to be saved on BACH under
d:\InetPub\wwwroot\cdr\Documentation\QC_Output_Format\CDR12345.html
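
As a rough illustration of the naming convention described above, the link
target and the file location for a given documentation document could be
derived as in the sketch below. The helper names are hypothetical, both base
locations are copied as written in this comment, and the ID is used unpadded
because the comment does not say whether it should be zero-padded.

```python
from pathlib import PureWindowsPath

# Both base locations are copied as written in the comment above.
URL_BASE = "http://bach/nci.nih.gov/cdr/Documentation/QC_Output_Format"
FILE_BASE = PureWindowsPath(r"d:\InetPub\wwwroot\cdr\Documentation\QC_Output_Format")


def sample_name(doc_id):
    """File name of the sample output for a documentation document ID."""
    # Assumption: the ID is used as given, without zero padding.
    return f"CDR{doc_id}.html"


def sample_url(doc_id, base=URL_BASE):
    """Value to put in the ExternalRef attribute."""
    return f"{base}/{sample_name(doc_id)}"


def sample_path(doc_id, base=FILE_BASE):
    """Location on BACH where the QC Report Output has to be saved."""
    return str(base / sample_name(doc_id))
```

Both helpers share sample_name(), so the link in the documentation document
and the saved file cannot drift apart for the same ID.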

Comment entered 2004-05-12 10:56:25 by Burg, Cheryl (NIH/NCI) [E] [X]

BZDATETIME::2004-05-12 10:56:25
BZCOMMENTOR::Cheryl Burg
BZCOMMENT::35

Links to sample output have been added under Output Structure for all QC Report
Filters and Include Filters which generate sections of the InScopeProtocol QC
Reports. In addition sample output has been added for Documentation Help
Screens and the Table Formatter Include filter.

Comment entered 2004-06-03 11:43:23 by Gottlieb, Nanci (NIH/NCI) [E] [X]

BZDATETIME::2004-06-03 11:43:23
BZCOMMENTOR::Nanci Gottlieb
BZCOMMENT::36

I'm back! I've noticed that there are some new filters since I last worked on
this so I will begin with them. I finished the documentation for "DocTitle for
PDQBoardMemberInfo" and have started working on documenting the
denormalization filter for PDQBoardMemberInfo.

Comment entered 2004-06-17 11:57:41 by Gottlieb, Nanci (NIH/NCI) [E] [X]

BZDATETIME::2004-06-17 11:57:41
BZCOMMENTOR::Nanci Gottlieb
BZCOMMENT::37

No progress to report. Cheryl and I are working on a special project for Sue
(for Gisele) so we will be unable to attend today's meeting.

Comment entered 2004-07-01 08:20:58 by Gottlieb, Nanci (NIH/NCI) [E] [X]

BZDATETIME::2004-07-01 08:20:58
BZCOMMENTOR::Nanci Gottlieb
BZCOMMENT::38

Status of Filter Documentation

I pretty much have them all started. I have a documentation document for
each filter. They can be viewed at
http://bach.nci.nih.gov/cgi-bin/cdr/Help.py?flavor=System. The ones with
asterisks in front have not been done except
for “Copy XML for Organization”, CDR0000315594 (Documentation - CDR0000360876)
and “Revision Markup Filter”, CDR0000000093 (Documentation - CDR0000362050).
I tried to go in and add them to the “System Information” document
(CDR0000256207) but Volker has had it checked out the last couple of days.

The following documents still need to be documented completely:
Denormalization Filter: PDQBoardMemberInfo - CDR0000000129
Denormalization of Contact Information for Person Orgs and Protocols
Module: Emailer Common
Module: General Markup Formatter
Copy XML for Citation QC Report
Copy XML for InScopeProtocol
Copy XML for PDQBoardMemberInfo
Copy XML for Person - QC Report
Copy XML for Person QC Report

Only the first one is on Bach, which is probably why I haven’t started them yet.

All the rest have the Filter Information section completed, some have Output
Structure section completed, some do not. Some have Processing Description
section completed, some do not.

I’m currently working on getting the statistics project up and running. I’m
not sure when I will be able to get back to the documentation. It might be a
good idea for whoever wrote each filter to complete the documentation for that
filter since they best understand what it is the filter does.

Thanks,
Nanci

Comment entered 2004-07-08 14:10:04 by Kline, Bob (NIH/NCI) [C]

BZDATETIME::2004-07-08 14:10:04
BZCOMMENTOR::Bob Kline
BZCOMMENT::39

Dropped priority for now.

Comment entered 2012-01-03 11:08:41 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2012-01-03 11:08:41
BZCOMMENTOR::Volker Englisch
BZCOMMENT::40

(In reply to comment #39)
> Dropped priority for now.

Bob, I think this issue should get closed or at least reassigned. It's been a P10 for a long, long time.

Comment entered 2012-01-03 11:12:47 by Kline, Bob (NIH/NCI) [C]

BZDATETIME::2012-01-03 11:12:47
BZCOMMENTOR::Bob Kline
BZCOMMENT::41

We've done as much as we're going to do for now.

Comment entered 2012-01-03 11:12:59 by Kline, Bob (NIH/NCI) [C]

BZDATETIME::2012-01-03 11:12:59
BZCOMMENTOR::Bob Kline
BZCOMMENT::42

Closing.
