CDR Tickets

Issue Number 2954
Summary [Genetics Directory] Vendor filter changes for publication from the CDR
Created 2009-08-27 16:22:43
Issue Type Improvement
Submitted By Beckwith, Margaret (NIH/NCI) [E]
Assigned To Englisch, Volker (NIH/NCI) [C]
Status Closed
Resolved 2010-04-08 15:58:05
Resolution Fixed
Path /home/bkline/backups/jira/ocecdr/issue.107282
Description

BZISSUE::4629
BZDATETIME::2009-08-27 16:22:43
BZCREATOR::Margaret Beckwith
BZASSIGNEE::Volker Englisch
BZQACONTACT::William Osei-Poku

We need to make changes to the vendor filter in order to publish the Genetics Directory from the CDR after conversion of the data.

Comment entered 2009-08-31 12:12:39 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2009-08-31 12:12:39
BZCOMMENTOR::Volker Englisch
BZCOMMENT::1

Will there be any changes to the licensee DTD?

Comment entered 2009-09-04 14:18:04 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2009-09-04 14:18:04
BZCOMMENTOR::Volker Englisch
BZCOMMENT::2

According to Bob there will be data to work with on MAHLER sometime next week.

The licensee DTD will not change.

Comment entered 2009-09-23 13:12:10 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2009-09-23 13:12:10
BZCOMMENTOR::Volker Englisch
BZCOMMENT::3

I have a question about the output of the vendor filters.

Is our goal to deliver documents that are identical (or as close as possible) to what we used to deliver from the point-of-view of the source of the document or the structure of the document?
For example:
The data is currently stored in the CDR like this

<ID>
12
</ID>
<NAME>
<SNAME>
Englisch
</SNAME>
<FIRSTNAME>
Volker
</FIRSTNAME>
</NAME>
...

and this is how we deliver the data to our licensees, instead of something like this (which has the advantage that the text node doesn't include white space before and after the data):

<ID>12</ID>
<NAME><SNAME>Englisch</SNAME><FIRSTNAME>Volker</FIRSTNAME></NAME>
...

My suggestion would be to use the same format as all other CDR document types but we wouldn't be able to create diff reports between the old and the new data formats.
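
A minimal sketch of the difference between the two layouts, assuming Python's standard xml.etree.ElementTree rather than the actual vendor filter. The RECORD wrapper element and the compact() helper are hypothetical and exist only to make the sample well-formed and runnable.

```python
import xml.etree.ElementTree as ET

# Sample in the old layout; RECORD is a hypothetical wrapper element.
OLD_STYLE = """<RECORD>
<ID>
12
</ID>
<NAME>
<SNAME>
Englisch
</SNAME>
<FIRSTNAME>
Volker
</FIRSTNAME>
</NAME>
</RECORD>"""


def compact(elem):
    """Strip white space around every text node, in place."""
    for node in elem.iter():
        node.text = (node.text or "").strip() or None
        node.tail = (node.tail or "").strip() or None
    return elem


root = compact(ET.fromstring(OLD_STYLE))
compact_xml = ET.tostring(root, encoding="unicode")
print(compact_xml)
```

Stripping every text node like this is only safe for data-style records; documents with mixed content would lose significant spaces.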

Comment entered 2009-10-06 19:45:08 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2009-10-06 19:45:08
BZCOMMENTOR::Volker Englisch
BZCOMMENT::4

(In reply to comment #3)
> My suggestion would be to use the same format as all other CDR document types

Per discussion at the status meeting this is what we're going to do.

Comment entered 2009-10-06 19:47:47 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2009-10-06 19:47:47
BZCOMMENTOR::Volker Englisch
BZCOMMENT::5

I noticed that the degrees for the GenProf documents are displayed with periods as in
M.D., Ph.D., M.S., etc.
while the LOV for the element StandardProfessionalSuffix doesn't display the periods
MD, PhD, MS, etc.

Are we keeping the values as defined in the LOV or do we need to convert them in the filter?

Comment entered 2009-10-07 15:19:42 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2009-10-07 15:19:42
BZCOMMENTOR::Volker Englisch
BZCOMMENT::6

Another Data question:
The country is listed as 'United States' in the current data which is the CountryShortName in the country document.
We are typically using the CountryFullName for display, which is 'U.S.A.' for the US.

Should we use United States or U.S.A. as the country name?

If we change the display it may be possible that Cancer.gov will display the country for US addresses as well as for foreign countries but that might happen anyway because we won't submit the extra spaces anymore (see Comment #3).

Comment entered 2009-10-07 18:33:50 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2009-10-07 18:33:50
BZCOMMENTOR::Volker Englisch
BZCOMMENT::7

I've finished the filter as much as I could without seeing the actual data.
What's left to be done is the CancerType/CancerSite information and finishing up the address information.
The filter currently used is
CDR650153 - [Test] Vendor GenProf

Comment entered 2009-10-09 15:30:22 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2009-10-09 15:30:22
BZCOMMENTOR::Volker Englisch
BZCOMMENT::8

(In reply to comment #6)
> If we change the display it may be possible that Cancer.gov will display the
> country for US addresses as well as for foreign countries but that might
> happen anyway because we won't submit the extra spaces anymore (see
> Comment #3).

I've talked with Blair about this issue. The Gatekeeper/Cancer.gov software is checking for the string 'United States' in three places (as far as we could see) and the publishing of the new GenProf. documents would need to be coordinated with a change of the software.
In short, changes will be necessary on Cancer.gov if we change the country display from 'United States' to 'U.S.A.' but those changes will be minor.

Comment entered 2009-10-28 16:19:40 by Beckwith, Margaret (NIH/NCI) [E]

BZDATETIME::2009-10-28 16:19:40
BZCOMMENTOR::Margaret Beckwith
BZCOMMENT::9

Currently in the Genetics Directory on Cancer.gov we leave the country off if it is the United States. Could we just do that (or maybe I am missing the problem here)?

Comment entered 2009-10-28 16:29:23 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2009-10-28 16:29:23
BZCOMMENTOR::Volker Englisch
BZCOMMENT::10

(In reply to comment #9)
> Currently in the Genetics Directory on Cancer.gov we leave the country off

That's correct but Cancer.gov has code that looks for the string 'United States'. When it sees this string it will suppress the display of the country. After the conversion we will send the string 'U.S.A.' to Cancer.gov and therefore the country would be displayed without a code/string change.
As I said, it's a minor change but a change that will need to be coordinated with the implementation of our filters to create the "new" documents.
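
The coordination problem described above can be sketched as follows. This is an illustration only, not the Gatekeeper/Cancer.gov code: the function name address_lines, the SUPPRESSED_COUNTRIES set, and the sample address line are all hypothetical.

```python
# String the site is described as checking for (per the comment above).
SUPPRESSED_COUNTRIES = {"United States"}


def address_lines(lines, country, suppressed=SUPPRESSED_COUNTRIES):
    """Return the display lines, leaving the country off when it is suppressed."""
    if country in suppressed:
        return list(lines)
    return list(lines) + [country]


# Today: the country string matches and is not displayed.
before = address_lines(["Line 1"], "United States")

# After the conversion without a site change: 'U.S.A.' is displayed.
after = address_lines(["Line 1"], "U.S.A.")

# After a coordinated site change: 'U.S.A.' is suppressed as well.
coordinated = address_lines(
    ["Line 1"], "U.S.A.", suppressed={"United States", "U.S.A."}
)
```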

Comment entered 2009-10-28 16:33:07 by Beckwith, Margaret (NIH/NCI) [E]

BZDATETIME::2009-10-28 16:33:07
BZCOMMENTOR::Margaret Beckwith
BZCOMMENT::11

Thanks for the explanation! This makes sense, and I figured it must be something along those lines.

Comment entered 2009-11-02 15:33:50 by Osei-Poku, William (NIH/NCI) [C]

BZDATETIME::2009-11-02 15:33:50
BZCOMMENTOR::William Osei-Poku
BZCOMMENT::12

Here are a few observations from our test on Franck comparing with data from the cancer.gov web site. I am pretty sure some of these things may have been addressed already in the vendor filter but I just wanted to include them here just in case you are not aware of them.

1. Zip code
When zip codes have 4 digit extensions, the hyphen is not displayed. For example
GP 362 (CDR658172 on Franck) - Erin R. Dola
Web site http://www.cancer.gov/search/view_geneticspro.aspx?personid=556226

GP 245 (CDR658198) Linda Robinson

Web site http://www.cancer.gov/search/view_geneticspro.aspx?personid=556055

2. City names without comma
The address information is displayed on the web site without the usual comma that follows the city name: Examples, same as above.
SALT LAKE CITY UT 84112 5550
SALT LAKE CITY, UT 84112-5550

3. Email address
In the Gen Prof database, the email address of the professional is entered in only one location but on the web site, it is added to all locations (in the case of multiple locations). In the same way in the CDR, the email address would usually be found at the CIPS contact location. I guess CIAT need not do anything if this will continue to be the case after the conversion.

Comment entered 2009-11-06 12:42:42 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2009-11-06 12:42:42
BZCOMMENTOR::Volker Englisch
BZCOMMENT::13

(In reply to comment #12)
> 1. Zip code
> When zip codes have 4 digit extensions, the hyphen is not displayed.

Good, we do want to display the ZIP+4 properly including the hyphen.

> 2. City names without comma

I believe this will require a change request to the Cancer.gov team.
I'll mention this in combination with the switch for the country from
United States to U.S.A.

> 3. Email address
> In the Gen Prof database, the email address of the professional is entered in
> only one location but on the web site, it is added to all locations (in the
> case of multiple locations). In the same way in the CDR, the email address
> would usually be found at the CIPS contact location. I guess CIAT need not do
> anything if this will continue to be the case after the conversion.

I'm not sure what you are trying to say here.
Looking at the sample of Linda Robinson, the data that we provide to Cancer.gov does have the email address listed for each location. Therefore, the Cancer.gov correctly displays the email address submitted with each of the location blocks.
In the CDR there is only one email address listed for one of the location blocks.
What are you proposing for the vendor filters to do?
a) Create the email based on the SpecificEmail element if it exists
b) Create the same email for all of the location blocks if a single
SpecificEmail element exists for any of the location blocks
c) What would you want to do if two different emails exist?

Comment entered 2009-11-06 12:57:30 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2009-11-06 12:57:30
BZCOMMENTOR::Volker Englisch
BZCOMMENT::14

In the GeneticsProfessional data (the current vendor output) there are three children of the name element:
<NAME>
<SNAME>
<FIRSTNAME>
<LASTNAME>
</NAME>

Can anyone tell me how the SNAME element is being constructed and if this element is being used by Cancer.gov?

It appears that this element is something like a display name and it combines the first character of the first name with the middle initial and the last name like:
Volker Englisch --> V Englisch
James M. Karns --> JM Karns

but it also contains things like
Diana Moglia --> M Moglia

It doesn't appear that this element is being used at all on Cancer.gov (I'll double-check with Blair) but it would be nice to know how this element should get constructed.

Comment entered 2009-11-06 13:40:49 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2009-11-06 13:40:49
BZCOMMENTOR::Volker Englisch
BZCOMMENT::15

(In reply to comment #13)
> I believe this will require a change request to the Cancer.gov team.
> I'll mention this in combination with the switch for the country from
> United States to U.S.A.

I was wrong about this one.

Cancer.gov is displaying the address block as it's been presented to them via the CADD elements and it is not constructing the address from the City, State, Zip information.
I will modify the vendor filter accordingly.
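
A minimal sketch of the address line the filter would need to emit, assuming the postal code is stored as a single string and passed through unchanged. The function name city_state_zip is hypothetical, and the real filter is not written in Python.

```python
def city_state_zip(city, state, postal_code):
    """Build one display line with a comma after the city.

    The postal code is used exactly as stored, so a ZIP+4 value keeps
    its hyphen.
    """
    return f"{city}, {state} {postal_code}"


line = city_state_zip("SALT LAKE CITY", "UT", "84112-5550")
print(line)
```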

Comment entered 2009-11-06 14:02:32 by Osei-Poku, William (NIH/NCI) [C]

BZDATETIME::2009-11-06 14:02:32
BZCOMMENTOR::William Osei-Poku
BZCOMMENT::16

(In reply to comment #13)

> > 3. Email address
> > In the Gen Prof database, the email address of the professional is entered in
> > only one location but on the web site, it is added to all locations (in the
> > case of multiple locations). In the same way in the CDR, the email address
> > would usually be found at the CIPS contact location. I guess CIAT need not do
> > anything if this will continue to be the case after the conversion.
>
> I'm not sure what you are trying to say here.
> Looking at the sample of Linda Robinson, the data that we provide to Cancer.gov
> does have the email address listed for each location. Therefore, the
> Cancer.gov correctly displays the email address submitted with each of the
> location blocks.
> In the CDR there is only one email address listed for one of the location
> blocks.

You're right and we recognize all the above.

> What are you proposing for the vendor filters to do?
> a) Create the email based on the SpecificEmail element if it exists
> b) Create the same email for all of the location blocks if a single
> SpecificEmail element exists for any of the location blocks
> c) What would you want to do if two different emails exist?

We are going to maintain the email address only at the SpecificEmail of the CIPContact. So this was just a heads up. Because we have already said that nothing was going to change on Cancer.gov, I wanted you to be aware of where the email address would be so that you can still include it at every location, in case of multiple locations. In other words, we will continue to have one email address but it needs to continue to be displayed at all locations.

Comment entered 2009-11-06 14:37:15 by Osei-Poku, William (NIH/NCI) [C]

BZDATETIME::2009-11-06 14:37:15
BZCOMMENTOR::William Osei-Poku
BZCOMMENT::17

(In reply to comment #14)
> In the GeneticsProfessional data (the current vendor output) there are three
> children of the name element:
> <NAME>
> <SNAME>
> <FIRSTNAME>
> <LASTNAME>
> </NAME>
>
> Can anyone tell me how the SNAME element is being constructed and if this
> element is being used by Cancer.gov?
>
> It appears that this element is something like a display name and it combines
> the first character of the first name with the middle initial and the last name
> like:
> Volker Englisch --> V Englisch
> James M. Karns --> JM Karns
>
> but it also contains things like
> Diana Moglia --> M Moglia
>
> It doesn't appear that this element is being used at all on Cancer.gov (I'll
> double-check with Blair) but it would be nice to know how this element should
> get constructed.

I am adding Bob (and Margaret) to this issue to see if he can answer this question. I am not sure how it is being used. For the converted documents we have some of the combinations you mentioned above but for most of the records, they were converted as GivenName, MiddleInitial and SurName.

Comment entered 2009-11-06 14:40:47 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2009-11-06 14:40:47
BZCOMMENTOR::Volker Englisch
BZCOMMENT::18

(In reply to comment #16)
> We are going to maintain the email address only at the SpecificEmail of the
> CIPContact.

I see. I wasn't aware of this.
I have updated the vendor filter to display the email address listed as the SpecificEmail of the CIPSContact location for any location displayed.

Comment entered 2009-11-06 14:53:31 by Kline, Bob (NIH/NCI) [C]

BZDATETIME::2009-11-06 14:53:31
BZCOMMENTOR::Bob Kline
BZCOMMENT::19

(In reply to comment #17)
> (In reply to comment #14)
> > In the GeneticsProfessional data (the current vendor output) there are three
> > children of the name element:
> > <NAME>
> > <SNAME>
> > <FIRSTNAME>
> > <LASTNAME>
> > </NAME>
> >
> > Can anyone tell me how the SNAME element is being constructed and if this
> > element is being used by Cancer.gov?

No idea. We've always just passed through what we get from CIAT without touching (or needing to understand) any of it.

Comment entered 2009-11-09 08:30:11 by Kline, Bob (NIH/NCI) [C]

BZDATETIME::2009-11-09 08:30:11
BZCOMMENTOR::Bob Kline
BZCOMMENT::20

If I had to guess I would say that SNAME probably stands for "short name" and would be constructed for matching the form of authors' names as they are cited in scholarly bibliographies, with everything reduced to initials except the surname.
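
If that guess is right, the construction could be sketched like this. The function short_name is hypothetical. It reproduces the two regular examples from comment #14 but not the 'Diana Moglia --> M Moglia' case, which does not follow the pattern.

```python
def short_name(given, surname, middle=""):
    """Reduce the given name and middle initial to initials, keep the surname."""
    initials = "".join(part[0].upper() for part in (given, middle) if part)
    return f"{initials} {surname}".strip()


examples = [
    short_name("Volker", "Englisch"),
    short_name("James", "Karns", middle="M."),
]
print(examples)
```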

Comment entered 2009-11-09 18:38:41 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2009-11-09 18:38:41
BZCOMMENTOR::Volker Englisch
BZCOMMENT::21

Is it correct that the CancerType/Typename within a FamilyCancerSyndrome is sorted as well as the CancerSite children within each CancerType?

I have copied the vendor and denormalization filter for the GeneticsProfessionals to FRANCK at this point. The filter creates valid XML output but I still have to sort and consolidate the individual CancerType sections.

Comment entered 2009-11-10 17:52:08 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2009-11-10 17:52:08
BZCOMMENTOR::Volker Englisch
BZCOMMENT::22

I was able to finish the display of the FamilyCancerSyndrome sections on MAHLER.
It doesn't look close to what is sent to Gatekeeper but given the limited MenuInformation on MAHLER that's probably expected.

I've used the document CDR19859 (Clark Robin) as my primary test case.

The following filter has been created:
CDR652508 - Copy XML for GeneticsProfessional

Please note that the new filters are not on FRANCK anymore since FRANCK had been refreshed today.

This is ready for review on MAHLER.

Comment entered 2009-11-12 15:35:43 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2009-11-12 15:35:43
BZCOMMENTOR::Volker Englisch
BZCOMMENT::23

Per request I'm attaching the XML output of the new GeneticsProfessional filter to this issue. This was created on MAHLER with data I prepared myself.

Comment entered 2009-11-12 15:35:43 by Englisch, Volker (NIH/NCI) [C]

Attachment GenProf_19859_Vendor.xml has been added with description: Vendor Output for CDR19859

Comment entered 2009-11-12 15:59:45 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2009-11-12 15:59:45
BZCOMMENTOR::Volker Englisch
BZCOMMENT::24

(In reply to comment #16)
> We are going to maintain the email address only at the SpecificEmail of the
> CIPContact.

Per our discussion at the status meeting the email address to be picked up will not come from the address block with the CIPSContact fragment but from the address block containing the
UsedFor = "Mailer"
block.
This will need to be modified in the vendor filter.

Comment entered 2009-11-12 16:04:37 by Kline, Bob (NIH/NCI) [C]

BZDATETIME::2009-11-12 16:04:37
BZCOMMENTOR::Bob Kline
BZCOMMENT::25

(In reply to comment #24)
> (In reply to comment #16)
> > We are going to maintain the email address only at the SpecificEmail of the
> > CIPContact.
>
> Per our discussion at the status meeting the email address to be picked up will
> not come from the address block with the CIPSContact fragment but from the
> address block containing the
> UsedFor = "Mailer"
> block.
> This will need to be modified in the vendor filter.

To be more specific, you'll want to look in the block which has the NMTOKEN GPMailer in the NMTOKENS UsedFor attribute. The token 'GPMailer' might not be the only value in the attribute. It might, for example, be 'GP GPMailer' (or 'GPMailer GP') in the case where the information in the practice location block repeats information found in the tblMain table.
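
A minimal sketch of the token test, assuming Python's standard xml.etree.ElementTree. The Person/Location layout, the sample address, and the helper names are hypothetical; only the UsedFor attribute, the GPMailer token, and the SpecificEmail element come from this issue.

```python
import xml.etree.ElementTree as ET

# Hypothetical document layout, for illustration only.
SAMPLE = """<Person>
  <Location UsedFor="GP">
    <City>First City</City>
  </Location>
  <Location UsedFor="GP GPMailer">
    <SpecificEmail>someone@example.org</SpecificEmail>
  </Location>
</Person>"""


def has_token(value, token):
    """True if token is one of the white-space separated NMTOKENS in value."""
    return token in (value or "").split()


def mailer_email(person):
    """Return the SpecificEmail of the first block whose UsedFor has GPMailer."""
    for block in person.iter("Location"):
        if has_token(block.get("UsedFor"), "GPMailer"):
            return block.findtext("SpecificEmail")
    return None


email = mailer_email(ET.fromstring(SAMPLE))
print(email)
```

Splitting on white space avoids the false matches a plain substring test would give for a longer token that merely contains 'GPMailer'.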

Comment entered 2009-11-16 17:48:54 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2009-11-16 17:48:54
BZCOMMENTOR::Volker Englisch
BZCOMMENT::26

(In reply to comment #25)
> To be more specific, you'll want to look in the block which has the NMTOKEN
> GPMailer in the NMTOKENS UsedFor attribute. The token 'GPMailer' might not be
> the only value in the attribute. It might, for example, be 'GP GPMailer' (or
> 'GPMailer GP') in the case where the information in the practice location
> block repeats information found in the tblMain table.

That's exactly what I meant to say. :-)

I've modified the vendor filter to pick up the email address from the address block marked with the UsedFor = "... GPMailer ..." attribute instead of the CIPSContact block.

Comment entered 2009-11-20 17:21:02 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2009-11-20 17:21:02
BZCOMMENTOR::Volker Englisch
BZCOMMENT::27

I've copied the modified vendor filters to FRANCK for testing but I will first need to modify the SELECT statements of the publishing document to pick up the appropriate Person documents for publishing.

I am guessing that the criteria for picking up a Person document to be processed as a GeneticsProfessional is the "Include in Directory" element.
Please let me know if this is not correct.

Comment entered 2009-11-20 17:43:40 by Kline, Bob (NIH/NCI) [C]

BZDATETIME::2009-11-20 17:43:40
BZCOMMENTOR::Bob Kline
BZCOMMENT::28

(In reply to comment #27)
> I've copied the modified vendor filters to FRANCK for testing but I will first
> need to modify the SELECT statements of the publishing document to pick up the
> appropriate Person documents for publishing.
>
> I am guessing that the criteria for picking up a Person document to be
> processed as a GeneticsProfessional is the "Include in Directory" element.
> Please let me know if this is not correct.

Right. There are two places in the schemas where the Include element can appear: be sure you're looking at the one in the GeneticsProfessionalDetails block (not the PhysicianDetails block).

Comment entered 2009-11-23 10:38:31 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2009-11-23 10:38:31
BZCOMMENTOR::Volker Englisch
BZCOMMENT::29

Bob, I am now picking up 499 person documents (on FRANCK) to be published as GenProf documents. The last publishing job produced 535 documents.
I'm thinking you may have a few duplicates and possibly invalid documents that could account for a lower number of person documents used for GP.
What number of GP documents would you expect?

Comment entered 2009-11-23 11:13:04 by Kline, Bob (NIH/NCI) [C]

BZDATETIME::2009-11-23 11:13:04
BZCOMMENTOR::Bob Kline
BZCOMMENT::30

Let me do some digging.

Comment entered 2009-11-23 13:08:43 by Kline, Bob (NIH/NCI) [C]

BZDATETIME::2009-11-23 13:08:43
BZCOMMENTOR::Bob Kline
BZCOMMENT::31

I would have expected 497, not 499, because that's the number which have 'Include' for /AdministrativeInformation/Directory/Include under the GeneticsProfessionalDetails block in the query_term_pub table. The reason it's lower than 535 is that the conversion resulted in some invalid documents which CIAT plans to fix once the conversion is done on Bach (mostly non-US people, because of the funky way the genprof database shoehorned their addresses into a table structure which assumed US addresses).

Comment entered 2009-11-23 13:53:47 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2009-11-23 13:53:47
BZCOMMENTOR::Volker Englisch
BZCOMMENT::32

> I would have expected 497, not 499, because that's the number which have
> 'Include' for /AdministrativeInformation/Directory/Include under the
> GeneticsProfessionalDetails block in the query_term_pub table.

It looks like there are two documents that are blocked (CDR10700, CDR19862) and two documents for which the latest version in doc_version is not publishable and the latest publishable version doesn't include the 'Include' flag (CDR3766, CDR7295). Therefore, there will only be 495 documents created on FRANCK.
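
The count can be checked with a short sketch. The variable names are hypothetical; the numbers and document IDs are the ones given in this comment and comment #29.

```python
# Person documents picked up by the selection on FRANCK (comment #29).
selected = 499

# Documents that will not produce output, grouped by reason.
blocked = {"CDR10700", "CDR19862"}
no_publishable_include = {"CDR3766", "CDR7295"}

excluded = blocked | no_publishable_include
expected_output = selected - len(excluded)
print(expected_output)
```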

Comment entered 2009-11-23 16:54:18 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2009-11-23 16:54:18
BZCOMMENTOR::Volker Englisch
BZCOMMENT::33

I've modified the publishing document
CDR000178.xml
and installed it on FRANCK to run a publishing job.
The publishing job selected 495 documents to be published of which 14 failed (see the failure report) due to missing data.
http://franck.nci.nih.gov/cgi-bin/cdr/PubStatus.py?id=6637&type=FilterFailure

The remaining 481 documents are valid but differ in many elements from the currently published documents.

I'll attach a sample for review.

Comment entered 2009-11-23 16:57:23 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2009-11-23 16:57:23
BZCOMMENTOR::Volker Englisch
BZCOMMENT::34

Comment entered 2009-11-23 16:57:23 by Englisch, Volker (NIH/NCI) [C]

Attachment CDR556157.xml has been added with description: GP PJ LeMarbre - Old output

Comment entered 2009-11-23 16:58:40 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2009-11-23 16:58:40
BZCOMMENTOR::Volker Englisch
BZCOMMENT::35

Comment entered 2009-11-23 16:58:40 by Englisch, Volker (NIH/NCI) [C]

Attachment CDR828.xml has been added with description: GP PJ LeMarbre - New output (FRANCK)

Comment entered 2009-11-24 09:22:14 by Kline, Bob (NIH/NCI) [C]

BZDATETIME::2009-11-24 09:22:14
BZCOMMENTOR::Bob Kline
BZCOMMENT::36

For the TYPE element, I think you can have the filter strip everything from the first occurrence of " (" onward. For SPECIALTY/BDCT I think you're only supposed to include that if the value is "Yes." As for the syndromes, it looks as if there are so many changes (beyond capitalization) that it might be good to get CIAT to confirm that they're intentionally making extensive changes to the cancer types and sites associated with the various syndromes.
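
The suggested rule for the TYPE element can be sketched as follows. The function name and the sample values are hypothetical, and the real filter is not written in Python.

```python
def strip_parenthetical(value):
    """Drop everything from the first occurrence of ' (' onward."""
    head, _sep, _tail = value.partition(" (")
    return head


stripped = strip_parenthetical("Type A (extra detail)")
unchanged = strip_parenthetical("Type B")
print(stripped, unchanged)
```

str.partition returns the whole string as the first item when the separator is absent, so values without a parenthetical pass through unchanged.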

Comment entered 2009-12-01 11:42:03 by Kline, Bob (NIH/NCI) [C]

BZDATETIME::2009-12-01 11:42:03
BZCOMMENTOR::Bob Kline
BZCOMMENT::37

William:

How's progress coming on the review of Volker's publication results? We can't go any further with #4522 until this task is wrapped up.

Comment entered 2009-12-01 12:19:34 by Osei-Poku, William (NIH/NCI) [C]

BZDATETIME::2009-12-01 12:19:34
BZCOMMENTOR::William Osei-Poku
BZCOMMENT::38

(In reply to comment #37)
> William:
>
> How's progress coming on the review of Volker's publication results? We can't
> go any further with #4522 until this task is wrapped up.

We will look at the results posted in comment #35 and post a comment this afternoon. I must have misunderstood what the next step was. I thought we were waiting for the publish preview to be completed before reviewing the results.

Comment entered 2009-12-01 13:26:14 by Osei-Poku, William (NIH/NCI) [C]

BZDATETIME::2009-12-01 13:26:14
BZCOMMENTOR::William Osei-Poku
BZCOMMENT::39

(In reply to comment #35)
> Created an attachment (id=1824) [details]
> GP PJ LeMarbre - New output (FRANCK)

I have a few observations:

1. It looks like the Institution will display as "Regional Cancer Center". However, looking at the CDR document, I think it should display as "Waukesha Memorial Hospital Regional Cancer Center". "Regional Cancer Center" is the old data from the Gen Prof. database which currently appears on cancer.gov but it has been updated in the CDR to "Waukesha Memorial Hospital Regional Cancer Center". I am assuming you are using the CDR data in which case it should display what is in the CDR.

2. (This is just an FYI) The Postal code in the new output is "53188-". This is exactly how it is in the CDR and it was inherited from the Gen Prof Database. This will eventually be cleaned up. Currently on Cancer.gov, the zip code is displayed correctly without the hyphen. I am assuming this is as a result of the changes you made in comment #13.

3. It appears the “Cowden syndrome” was not picked up in your output. The tag that was supposed to contain it appears to be empty.

Everything else looks good and appears to be consistent with what is currently displayed on cancer.gov

Comment entered 2009-12-01 13:50:32 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2009-12-01 13:50:32
BZCOMMENTOR::Volker Englisch
BZCOMMENT::40

(In reply to comment #39)
> 3. It appears the “Cowden syndrome” was not picked up in your output. The tag
> that was supposed to contain it appears to be empty.

The Menu information for this term doesn't list the DisplayName. That's why the element is empty. I'm guessing this is fixed on BACH.

> Everything else looks good and appears to be consistent with what
> is currently displayed on cancer.gov

This means it's OK that the terms displayed on Cancer.gov are different from what's picked up by the filter?

Comment entered 2009-12-01 14:33:08 by Beckwith, Margaret (NIH/NCI) [E]

BZDATETIME::2009-12-01 14:33:08
BZCOMMENTOR::Margaret Beckwith
BZCOMMENT::41

There is a display name for Cowden syndrome menu information on Bach. But I don't completely understand your question Volker. The syndrome names are supposed to match exactly what is on Cancer.gov, but the associated cancer types and cancer sites have been updated a bit so may not match exactly. I asked you if that was okay, and you said it was. Am I confused here?

Comment entered 2009-12-01 14:55:22 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2009-12-01 14:55:22
BZCOMMENTOR::Volker Englisch
BZCOMMENT::42

(In reply to comment #41)
> I asked you if that was okay, and you said it was. Am I confused here?

It is OK from a processing point of view. I just want to make sure that the information that is picked up by the filters matches the information that is expected. I can see from the output created that it is different but I'm unable to tell if it's "different/good" or "different/bad".

Comment entered 2009-12-01 15:12:19 by Osei-Poku, William (NIH/NCI) [C]

BZDATETIME::2009-12-01 15:12:19
BZCOMMENTOR::William Osei-Poku
BZCOMMENT::43

(In reply to comment #42)
> (In reply to comment #41)
> > I asked you if that was okay, and you said it was. Am I confused here?
>
> It is OK from a processing point of view. I just want to make sure that the
> information that is picked up by the filters matches the information that is
> expected. I can see from the output created that it is different but I'm
> unable to tell if it's "different/good" or "different/bad".

Yes. Mary and I looked at them together and she said they matched correctly.

Comment entered 2009-12-01 18:10:27 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2009-12-01 18:10:27
BZCOMMENTOR::Volker Englisch
BZCOMMENT::44

(In reply to comment #36)
> For SPECIALTY/BDCT I think you're only
> supposed to include that if the value is "Yes."

Margaret, I'm wondering if this is correct because I do see "Board certified = NO"
entries on Cancer.gov.

Comment entered 2009-12-01 19:51:36 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2009-12-01 19:51:36
BZCOMMENTOR::Volker Englisch
BZCOMMENT::45

(In reply to comment #39)
> 2. (This is just an FYI) The Postal code in the new output is "53188-". This
> is exactly how it is in the CDR

Yes, that's what I'm doing. I merely display the PostalCode_ZIP as it's listed in the data.

I've updated the filters to address Bob's Comment #36 (a) and William's comment #39 (1).

The filters are on FRANCK.

Comment entered 2009-12-07 16:25:33 by Osei-Poku, William (NIH/NCI) [C]

BZDATETIME::2009-12-07 16:25:33
BZCOMMENTOR::William Osei-Poku
BZCOMMENT::46

(In reply to comment #45)
> (In reply to comment #39)
> > 2. (This is just an FYI) The Postal code in the new output is "53188-". This
> > is exactly how it is in the CDR
>
> Yes, that's what I'm doing. I merely display the PostalCode_ZIP as it's listed
> in the data.
>
> I've updated the filters to address Bob's Comment #36 (a) and William's comment
> #39 (1).
>
> The filters are on FRANCK.

Can I use the Filter Document report to test other GP documents?

Comment entered 2009-12-07 16:27:52 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2009-12-07 16:27:52
BZCOMMENTOR::Volker Englisch
BZCOMMENT::47

(In reply to comment #46)
> Can I use the Filter Document report to test other GP documents?

Yes.

Comment entered 2009-12-08 15:17:45 by Osei-Poku, William (NIH/NCI) [C]

BZDATETIME::2009-12-08 15:17:45
BZCOMMENTOR::William Osei-Poku
BZCOMMENT::48

(In reply to comment #47)
> (In reply to comment #46)
> > Can I use the Filter Document report to test other GP documents?
>
>
> Yes.

Is there a way to test with data from Bach without refreshing Franck? I was able to filter three documents and saw a few missing values that may have been fixed on Bach. It looks like either testing on Bach or with Bach data is the appropriate thing to do at this point.

Comment entered 2009-12-08 15:23:43 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2009-12-08 15:23:43
BZCOMMENTOR::Volker Englisch
BZCOMMENT::49

> Is there a way to test with data from Bach without refreshing Franck?

No, you can only filter documents that are on the local server with this interface.

Comment entered 2009-12-11 13:26:04 by Osei-Poku, William (NIH/NCI) [C]

BZDATETIME::2009-12-11 13:26:04
BZCOMMENTOR::William Osei-Poku
BZCOMMENT::50

(In reply to comment #49)
> > Is there a way to test with data from Bach without refreshing Franck?
>
> No, you can only filter documents that are on the local server with this
> interface.

Mary and I QCed the filtered documents using the terminology report. She fixed a few things on Bach and everything appears to be fine at this point. We will QC again when all the Genetics Professional tasks are promoted to Bach. At this point everything else looks good.

Comment entered 2009-12-14 12:12:54 by Kline, Bob (NIH/NCI) [C]

BZDATETIME::2009-12-14 12:12:54
BZCOMMENTOR::Bob Kline
BZCOMMENT::51

Next step on this issue is for Volker to draft, get approval for, and
send out a message to the licensees informing them that we will no longer be
including Person documents in the weekly export, and asking them to let us know
of any impact that might have on their operations.

Comment entered 2009-12-15 16:12:47 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2009-12-15 16:12:47
BZCOMMENTOR::Volker Englisch
BZCOMMENT::52

The notification to the licensees went out yesterday.

Comment entered 2009-12-21 12:16:33 by Osei-Poku, William (NIH/NCI) [C]

BZDATETIME::2009-12-21 12:16:33
BZCOMMENTOR::William Osei-Poku
BZCOMMENT::53

The application submission form on cancer.gov is currently sent to the Lockheed/Aspensys email address. It should instead be sent to the new email address, GeneticsDirectory@cancer.gov. I am adding this comment so that this change will be included with the other cancer.gov changes for the Genetics Directory.

Comment entered 2009-12-21 12:22:22 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2009-12-21 12:22:22
BZCOMMENTOR::Volker Englisch
BZCOMMENT::54

William, did you mean to add this last comment to the vendor filter issue?
There isn't any place in the Person data that uses this information, right?

Comment entered 2009-12-21 12:45:34 by Osei-Poku, William (NIH/NCI) [C]

BZDATETIME::2009-12-21 12:45:34
BZCOMMENTOR::William Osei-Poku
BZCOMMENT::55

(In reply to comment #54)
> William, did you mean to add this last comment to the vendor filter issue?
> There isn't any place in the Person data that uses this information, right?

Yes, and you're right: this does not affect the vendor filter changes you are making, and I probably should not have included it here. I already have the same comment in OCECDR-2847, where the email address also needs to be added to the mailer document. I may need to create another issue specifically for Genetics Directory cancer.gov issues. I will check with Margaret first.

Comment entered 2009-12-28 14:02:59 by Beckwith, Margaret (NIH/NCI) [E]

BZDATETIME::2009-12-28 14:02:59
BZCOMMENTOR::Margaret Beckwith
BZCOMMENT::56

I don't think there is a need to create an issue for the Cancer.gov issues. I am setting up a meeting with Christine to talk about several changes that need to be made to the application form on Cancer.gov, including the email change.

Comment entered 2009-12-31 16:41:30 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2009-12-31 16:41:30
BZCOMMENTOR::Volker Englisch
BZCOMMENT::57

I've modified the publishing document and run a small publishing export job (10 documents per document type) that successfully loaded on the Gatekeeper test machine.
I ran this job without first removing the existing GenProf document, which means that for these 10 documents the old and new output can now be compared easily. You can do this on
http://wwwgk.cancer.gov/search/geneticsservices/
and search for the following names:
Joseph LeMarbre
John J. Mulvihill
David Ginsburg
David G. Mutch
Mary B. Daly
Robert D. Burk
Kathy J. Helzlsouer
David Smotkin
Kenneth Offit
For the new version you will see the Prof. Suffix displayed without periods (MD), while the old version displays this data with periods (M.D.).

The following documents have been updated on FRANCK:
CDR178.xml (Publishing document)
cdrpub.py - R9458
pdqCG.dtd - R9441
pdq.dtd - R9459

Next, we will refresh FRANCK and then run a before/after publishing job.

Comment entered 2010-01-04 13:14:47 by Beckwith, Margaret (NIH/NCI) [E]

BZDATETIME::2010-01-04 13:14:47
BZCOMMENTOR::Margaret Beckwith
BZCOMMENT::58

My comments:

Data issues (I think)
1. On the Kathy Helzlsouer record, the email address looks like a personal email instead of the general one on the old form. Is this what we want?
2. Some of the types of cancer are lower case and should have the menu information fixed so that they match the others (e.g., hepatoblastoma, islet cell, renal transitional).
3. Zip code on John Mulvihill record has dash after it.
4. Professional suffixes on John Mulvihill record only show MD, not the BS, BMS--is this intentional?
5. Address for Mary Daly is missing Family Risk Assessment Program. Also, a typo in the street address needs to be fixed.

Other issues:
6. Syndrome names are missing from the table, even though there is an empty cell, and the cancer types are present. Examples on John Mulvihill record are Bloom syndrome; Carcinoid, familial; Cowden syndrome; Fanconi anemia; Rothmund Thomson syndrome.
7. Two practice locations showing on new record; see Kenneth Offit record, David Ginsburg record. It looks like part or all of the address is being repeated.

This is what I found from comparing 5 records. I think it would be good for CIAT to take a look at some records like this to see if there are any other issues.

Comment entered 2010-01-04 13:22:19 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2010-01-04 13:22:19
BZCOMMENTOR::Volker Englisch
BZCOMMENT::59

I suggest that CIAT double-check whether these problems have already been corrected on BACH.
Once I have refreshed FRANCK and rerun the publishing job I would expect many of these issues to go away.

Comment entered 2010-01-04 13:28:17 by Beckwith, Margaret (NIH/NCI) [E]

BZDATETIME::2010-01-04 13:28:17
BZCOMMENTOR::Margaret Beckwith
BZCOMMENT::60

When are we going to refresh Franck? It doesn't really make sense for them to spend a lot of time comparing if most of these have been fixed.

Comment entered 2010-01-04 13:35:44 by Kline, Bob (NIH/NCI) [C]

BZDATETIME::2010-01-04 13:35:44
BZCOMMENTOR::Bob Kline
BZCOMMENT::61

(In reply to comment #60)
> When are we going to refresh Franck?

We're waiting for CIAT to finish the review of the test for issue #4725 on Franck (see comment #27 from that issue).

Comment entered 2010-01-04 16:32:39 by Osei-Poku, William (NIH/NCI) [C]

BZDATETIME::2010-01-04 16:32:39
BZCOMMENTOR::William Osei-Poku
BZCOMMENT::62

(In reply to comment #61)
> (In reply to comment #60)
> > When are we going to refresh Franck?
>
> We're waiting for CIAT to finish the review of the test for issue #4725 on
> Franck (see comment #27 from that issue).

I have looked at most of the records and found issues similar to those identified by Margaret above. We had also identified a lot of the data issues and are waiting for conversion on Bach to fix them. With regard to the Syndrome issues, we have done a lot of testing in other issues and have fixed all the problems we found on Bach, so I am pretty sure the errors we are finding here have been fixed on Bach. However, it will be good for us to do additional testing when conversion is completed on Bach.

Volker:
I believe it is OK to refresh Franck at this point. OCECDR-3049 is ready to be promoted to Bach.

Comment entered 2010-01-05 17:14:38 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2010-01-05 17:14:38
BZCOMMENTOR::Volker Englisch
BZCOMMENT::63

CDR database on FRANCK has been refreshed.

Comment entered 2010-01-07 11:19:33 by Kline, Bob (NIH/NCI) [C]

BZDATETIME::2010-01-07 11:19:33
BZCOMMENTOR::Bob Kline
BZCOMMENT::64

A fresh batch of converted GP documents is on Franck, ready for the next round of publication tests.

Comment entered 2010-01-11 14:58:17 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2010-01-11 14:58:17
BZCOMMENTOR::Volker Englisch
BZCOMMENT::65

I ran a publishing job on FRANCK using the updated data. The updated documents are now available for review on
http://wwwgk.cancer.gov/search/search_geneticsservices.aspx

14 documents failed validation:
http://franck.nci.nih.gov/cgi-bin/cdr/PubStatus.py?id=6774&type=FilterFailure

The publish-preview is also working on FRANCK.

The data is ready for review on FRANCK.

Comment entered 2010-01-12 10:29:13 by Osei-Poku, William (NIH/NCI) [C]

BZDATETIME::2010-01-12 10:29:13
BZCOMMENTOR::William Osei-Poku
BZCOMMENT::66

(In reply to comment #65)
> I ran a publishing job on FRANCK using the updated data. The updated documents
> are now available for review on
> http://wwwgk.cancer.gov/search/search_geneticsservices.aspx
>
> 14 documents failed validation:
>

I looked at some of the records that failed validation. I will have some of them fixed so that you can run another publication job to see if the problems go away.

> http://franck.nci.nih.gov/cgi-bin/cdr/PubStatus.py?id=6774&type=FilterFailure
>
> The publish-preview is also working no FRANCK.
>
> The data is ready for review on FRANCK.

I have looked at some of the records and the problems I found were all data entry problems which we will eventually fix on Bach. I also found that the issue with Carcinoid Syndrome is still there. Mary and I are looking into fixing that problem both on Franck and Bach soon.

Volker:
I have two questions for you.

In what order are the addresses being displayed (in case of multiple locations)? Whichever comes first (In the CDR Record) as long as they are GP locations? The answer to this question will be helpful when creating a new record or when updating/replacing/adding it.

Also, I saw that for the Jessica Y. Adcock, MS record, the organization document (389491) is Inactive and blocked and yet it is displayed (published). Is that a problem, or are you already aware of this? In case you are wondering why we have it in the record, we intend to unblock some, if not all, of the blocked records that are currently linked to GP records. It is part of the cleanup we need to do.

Comment entered 2010-01-12 11:05:24 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2010-01-12 11:05:24
BZCOMMENTOR::Volker Englisch
BZCOMMENT::67

(In reply to comment #66)
> In what order are the addresses being displayed (in case of multiple
> locations)? Whichever comes first (In the CDR Record) as long as they are GP
> locations?

The filter doesn't specify any order, so the addresses are displayed in document order, i.e. in the order in which they appear in the XML.

> Also, I saw that for Jessica Y. Adcock, MS record, the organization document
> (389491) is Inactive and blocked and yet it is displayed (published).
> Is that a problem

As you can see it's not a problem in terms of publishing the record.
If this record is meant to be excluded from publishing I don't think I've seen a requirement for that type of restriction. The filters are currently processing everything that's flagged with 'Directory = Include'.
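
For illustration only, here is a minimal Python sketch of the behavior described in this comment (the real filter is XSLT and is not reproduced here). The `Person`, `Location`, and `Name` elements and the `Directory` and `OrgStatus` attributes are hypothetical stand-ins, and placing the flag on each location is an assumption. Only the 'Directory = Include' flag and the document-order behavior come from this thread.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample record; element and attribute names are assumptions.
SAMPLE = """
<Person>
  <Location Directory="Include" OrgStatus="Inactive"><Name>Second Clinic</Name></Location>
  <Location Directory="Exclude" OrgStatus="Active"><Name>Home Office</Name></Location>
  <Location Directory="Include" OrgStatus="Active"><Name>First Clinic</Name></Location>
</Person>
"""

def published_locations(person_xml):
    """Return the names of locations flagged Include, in document order."""
    root = ET.fromstring(person_xml)
    # iter() walks the tree in document order; no sorting is applied,
    # and the organization's status is never consulted.
    return [
        loc.findtext("Name")
        for loc in root.iter("Location")
        if loc.get("Directory") == "Include"
    ]

print(published_locations(SAMPLE))
```

In this sketch the inactive organization is still listed because nothing checks its status, which mirrors the situation William describes. Excluding such records would need an explicit extra rule.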

Comment entered 2010-01-12 17:15:26 by Osei-Poku, William (NIH/NCI) [C]

BZDATETIME::2010-01-12 17:15:26
BZCOMMENTOR::William Osei-Poku
BZCOMMENT::68

(In reply to comment #66)
> (In reply to comment #65)
> > I ran a publishing job on FRANCK using the updated data. The updated documents
> > are now available for review on
> > http://wwwgk.cancer.gov/search/search_geneticsservices.aspx
> >
> > 14 documents failed validation:
> >
>
> I looked at some of the records that failed validation. I will have some of
> them fixed so that you can run another publication job to see if the problems
> go away.
>

Please run another publishing job for the failed documents only (if that is possible).
Meanwhile, I also have a few questions:
1. For documents 663436 and 663103, they appear to have failed because of the Generational Suffixes they have. Is this correct?

2. For documents 412089, 404148, 269800 etc, they appear to have failed because of the Home Address block in their records. Is this correct? It looks like we need the home address in their records for Board Members.

3. It also appears that professional suffix is required, right? That may have been the reason for the failure of 360777, 330334, 271212 etc.

Comment entered 2010-01-13 11:55:29 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2010-01-13 11:55:29
BZCOMMENTOR::Volker Englisch
BZCOMMENT::69

(In reply to comment #68)
> Please run another publishing job for the failed documents only

Done.

> 1. For documents 663436 and 663103, they appear to have failed because of the
> Generational Suffixes they have. Is this correct?

Correct. This was a bug in the filter which has been corrected.

> 2. For documents 412089, 404148, 269800 etc, they appear to have failed
> because of the Home Address block in their records. Is this correct?

Correct. This was a bug in the filter which has been corrected.

> 3. It also appears that professional suffix is required, right?

Correct. The DTD defines the DEGREE element as 'Item can appear one or more times', so it is required.

Those formerly failed documents have been pushed to GatekeeperGK and are ready for review.
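
A rough pre-flight check for item 3, sketched in Python under stated assumptions: a content model of 'one or more' means a document with no DEGREE element fails validation, so such records could be flagged before a publishing job is run. The helper name `has_degree` and the simplified sample documents are hypothetical. The samples are not complete, DTD-valid vendor documents.

```python
import xml.etree.ElementTree as ET

def has_degree(filtered_xml):
    """Return True if the filtered document has at least one DEGREE element."""
    root = ET.fromstring(filtered_xml)
    # find() returns None when no matching element exists anywhere below root.
    return root.find(".//DEGREE") is not None

# Simplified, hypothetical samples (where DEGREE sits in the tree is assumed).
WITH_DEGREE = (
    "<GENETICSPROFESSIONAL><NAME><SNAME>Englisch</SNAME></NAME>"
    "<DEGREE>MD</DEGREE></GENETICSPROFESSIONAL>"
)
WITHOUT_DEGREE = (
    "<GENETICSPROFESSIONAL><NAME><SNAME>Englisch</SNAME></NAME>"
    "</GENETICSPROFESSIONAL>"
)

print(has_degree(WITH_DEGREE), has_degree(WITHOUT_DEGREE))
```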

Comment entered 2010-01-15 11:25:19 by Osei-Poku, William (NIH/NCI) [C]

BZDATETIME::2010-01-15 11:25:19
BZCOMMENTOR::William Osei-Poku
BZCOMMENT::70

We have completed the QA for the Gen Prof. documents and everything looks good now as far as Franck is concerned. It looks like Margaret will take a final look before this is promoted.
Thank you!

Comment entered 2010-01-21 12:19:37 by Beckwith, Margaret (NIH/NCI) [E]

BZDATETIME::2010-01-21 12:19:37
BZCOMMENTOR::Margaret Beckwith
BZCOMMENT::71

I checked the same 5 records I had looked at before in Publish Preview. The problem with the syndromes not showing up correctly has been fixed, but all other data issues are still there. I understand that CIAT is going to fix them after conversion. They will need to look at every record, I think. I do have a question about the dash after the 5-digit zip code. Is that really in the data?

Comment entered 2010-01-21 16:26:28 by Osei-Poku, William (NIH/NCI) [C]

BZDATETIME::2010-01-21 16:26:28
BZCOMMENTOR::William Osei-Poku
BZCOMMENT::72

In the CDR Meeting today, we agreed to do the conversion of the Gen Prof documents next week. However, the publication scripts should not be installed on Bach until a later date when the entire data cleanup is completed.

Comment entered 2010-01-28 13:44:30 by Kline, Bob (NIH/NCI) [C]

BZDATETIME::2010-01-28 13:44:30
BZCOMMENTOR::Bob Kline
BZCOMMENT::73

According to William, the vendor output filter is putting out LegacyData elements, which aren't permitted by the DTD. This is causing publish preview to fail (and of course, will cause real publication events to fail).

Comment entered 2010-01-28 17:22:43 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2010-01-28 17:22:43
BZCOMMENTOR::Volker Englisch
BZCOMMENT::74

(In reply to comment #73)
> According to William, the vendor output filter is putting out LegacyData
> elements, which aren't permitted by the DTD.

Publish preview worked last night and William had closed the PP issue.
Do we have an example document?
My training is over tomorrow around lunch time and I'll have a look at the issue at that time.

Comment entered 2010-02-01 16:51:36 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2010-02-01 16:51:36
BZCOMMENTOR::Volker Englisch
BZCOMMENT::75

(In reply to comment #73)
> According to William, the vendor output filter is putting out LegacyData
> elements, which aren't permitted by the DTD.

William, could you please give me additional information (i.e. a document with a problem)? Since publishing ran successfully on FRANCK with data from BACH, I'm not sure where we're having a problem.

Comment entered 2010-02-01 17:03:40 by Osei-Poku, William (NIH/NCI) [C]

BZDATETIME::2010-02-01 17:03:40
BZCOMMENTOR::William Osei-Poku
BZCOMMENT::76

(In reply to comment #75)
> (In reply to comment #73)
> > According to William, the vendor output filter is putting out LegacyData
> > elements, which aren't permitted by the DTD.
>
> William, if you could please give me additional information (i.e. a document
> with a problem). Since publishing ran successfully on FRANCK with data from
> BACH I'm not sure were we're having a problem.

Example - CDR0000663575 on Franck.

The problem has to do with making the ID element of the Legacy data block required for new documents. The new documents do not have any legacy data. When their mailers are generated, we may include their username and password in the Legacy Data block. But until then we do not have any information for them to put in the Legacy Data block.

Also, I closed the publish preview issue because I wanted to open another issue for enhancement and other problems we find while fixing the errors on Bach. Should I re-open it instead?

Comment entered 2010-02-01 17:43:25 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2010-02-01 17:43:25
BZCOMMENTOR::Volker Englisch
BZCOMMENT::77

(In reply to comment #76)
> The problem has to do with making the ID element of the Legacy data block
> required for new documents. The new documents do not have any legacy data.
> When their mailers are generated, we may include their username and
> password in the Legacy Data block. But until then we do not have any
> information for them to put in the Legacy Data block.

This explanation of the vendor filter behavior is correct but is opposite to what was reported in comment #73.
I'm assuming that the explanation from comment #76 describes the current behavior, namely to output a "legacy" element because it is required by the DTD.

It appears that we never tested publishing a new document using the updated filters. New documents are created without the mandatory ID element at this time and that will need to be fixed.

> Also, I closed the publish preview issue because I wanted to open another
> issue for enhancement and other problems we find while fixing the errors
> on Bach.
> Should I re-open it instead?

No, for enhancements please open a new issue.

Comment entered 2010-02-01 18:02:24 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2010-02-01 18:02:24
BZCOMMENTOR::Volker Englisch
BZCOMMENT::78

The vendor filter has been fixed on MAHLER and FRANCK:
CDR559215.xml - R9482: Vendor Filter: GeneticsProfessional

This is ready for review on FRANCK.

Comment entered 2010-02-08 21:10:09 by Osei-Poku, William (NIH/NCI) [C]

BZDATETIME::2010-02-08 21:10:09
BZCOMMENTOR::William Osei-Poku
BZCOMMENT::79

(In reply to comment #78)
> The vendor filter has been fixed on MAHLER and FRANCK:
> CDR559215.xml - R9482: Vendor Filter: GeneticsProfessional
> This is ready for review on FRANCK.

It works without a problem on Mahler (that is, when no Legacy information is present). However, on Franck, I am unable to get pub preview to work without the Legacy information. I get the following error:
"CDRPreview web service error: The element 'GENETICSPROFESSIONAL' has invalid child element 'NAME'. List of possible elements expected: 'ID'.Validation error occurred when validating the instance document.,44,2 "

When I add the Legacy Data, then I am able to get pub preview. Example – 665455 on Franck.

Comment entered 2010-02-09 01:37:21 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2010-02-09 01:37:21
BZCOMMENTOR::Volker Englisch
BZCOMMENT::80

(In reply to comment #79)
> (In reply to comment #78)
> It Works without a problem on Mahler (that is, when no Legacy information
> present). However, on Franck, I am unable to get pub prev. to work without the
> Legacy information.

It worked when the fix was implemented on Feb. 1st. However, we have since had to refresh FRANCK to test OCECDR-3074, which reverted the filters to the version on BACH.

Comment entered 2010-02-09 09:32:54 by Osei-Poku, William (NIH/NCI) [C]

BZDATETIME::2010-02-09 09:32:54
BZCOMMENTOR::William Osei-Poku
BZCOMMENT::81

(In reply to comment #80)
> (In reply to comment #79)
> > (In reply to comment #78)
> > It Works without a problem on Mahler (that is, when no Legacy information
> > present). However, on Franck, I am unable to get pub prev. to work without the
> > Legacy information.
> It worked when the fix had been implemented on Feb. 1st. However, since then
> we had to have FRANCK refreshed to test OCECDR-3074 which reverted the filters
> back to the version on BACH.

You're right! I forgot about the recent Franck refresh. Please promote the fix on Mahler to Bach. Thanks!

Comment entered 2010-02-17 18:04:02 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2010-02-17 18:04:02
BZCOMMENTOR::Volker Englisch
BZCOMMENT::82

I've talked with Blair about this problem.
I had modified the filter to populate the ID field with the CDR-ID if a LegacyID didn't exist, but Bob suggested checking with Blair whether we could drop the ID element completely for those documents, which would require a DTD change.

According to Blair the ID element is not used at all by Cancer.gov/Gatekeeper (but we don't know if this is true for the licensees). So we could possibly change the DTD and drop the ID element instead of artificially populating it.
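
The fallback described above (use the LegacyID when present, otherwise the CDR-ID) can be sketched as follows. This is a Python illustration, not the XSLT in the filter. The function name `vendor_id` and the exact form in which the CDR-ID would be written are assumptions.

```python
from typing import Optional

def vendor_id(legacy_id: Optional[str], cdr_id: str) -> str:
    """Pick the value for the mandatory ID element of the vendor output."""
    # Converted documents carry a legacy ID; documents created after the
    # conversion do not, so fall back to the CDR-ID for those.
    if legacy_id and legacy_id.strip():
        return legacy_id.strip()
    return cdr_id

print(vendor_id("12", "CDR0000663575"))
print(vendor_id(None, "CDR0000663575"))
```

The alternative discussed with Blair, dropping ID from the DTD, would make this fallback unnecessary but would change what licensees receive.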

Comment entered 2010-02-18 10:38:07 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2010-02-18 10:38:07
BZCOMMENTOR::Volker Englisch
BZCOMMENT::83

I have some notes from our last status meeting referring to missing elements from the vendor filter (TollFree Number, Service Limitation, Public=No).

William, could you please give me some more information on this?

Comment entered 2010-02-18 10:46:35 by Osei-Poku, William (NIH/NCI) [C]

BZDATETIME::2010-02-18 10:46:35
BZCOMMENTOR::William Osei-Poku
BZCOMMENT::84

(In reply to comment #83)
> I have some notes from our last status meeting referring to missing elements
> from the vendor filter (TollFree Number, Service Limitation, Public=No).
>
> William, could you please give me some more information on this?

Sure -

1. Toll free numbers don’t show up in publish preview
2. Phone numbers show up regardless of their attributes (Public or not)
3. Publish preview displays emails regardless of attributes (Public or not)
4. Service Limitation not displaying in publish preview - Example 664951

I know what your next question will be. Provide sample documents :-). I will provide one example for each of the above issues soon.

Comment entered 2010-02-18 10:51:19 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2010-02-18 10:51:19
BZCOMMENTOR::Volker Englisch
BZCOMMENT::85

(In reply to comment #84)
> 1. Toll free numbers don’t show up in publish preview

We don't display a TollFreeNumber per se in the vendor output, just a phone number so I'm guessing we need a rule of when to display a TollFreeNumber and when not to. Which phone number should be displayed if multiple ones exist?

> 2. Phone numbers show up regardless of their attributes (Public or not)
> 3. Publish preview displays emails regardless of attributes (Public or not)
> 4. Service Limitation not displaying in publish preview - Example 664951
>
> I know what your next question will be. Provide sample documents :-). I will
> provide one example for each of the above issues soon.

No, I don't need an example for 1-3 and you gave me one for item (4).

Comment entered 2010-02-19 12:45:49 by Osei-Poku, William (NIH/NCI) [C]

BZDATETIME::2010-02-19 12:45:49
BZCOMMENTOR::William Osei-Poku
BZCOMMENT::86

Another vendor filter change that we may have to make is for the use of PrivatePractice locations. Some of the addresses we are seeing are private practices and not organizations. However, it looks like you may need to make changes to the vendor filter to be able to publish these to cancer.gov.

Comment entered 2010-02-23 20:46:12 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2010-02-23 20:46:12
BZCOMMENTOR::Volker Englisch
BZCOMMENT::87

(In reply to comment #84)
> 4. Service Limitation not displaying in publish preview - Example 664951

Service Limitation doesn't appear to be an element in the DTD.
I'm not really sure what this field maps to?

I've fixed the Public=No problem for the SpecificPhone and SpecificEmail fields and I'm now including the TollFreePhone.
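
As a hedged illustration of the rule just described, the Python sketch below drops SpecificPhone and SpecificEmail values marked Public=No and always passes TollFreePhone through. The wrapping `Contact` element, the sample values, and the exact spelling of the `Public` attribute are assumptions. The actual change was made in the XSLT filter.

```python
import xml.etree.ElementTree as ET

# Hypothetical contact block; only the three child element names and the
# Public=No marker are taken from this thread.
CONTACT = """
<Contact>
  <SpecificPhone Public="No">301-555-0100</SpecificPhone>
  <SpecificPhone>301-555-0101</SpecificPhone>
  <TollFreePhone>800-555-0102</TollFreePhone>
  <SpecificEmail Public="No">private@example.org</SpecificEmail>
</Contact>
"""

HONORS_PUBLIC_FLAG = {"SpecificPhone", "SpecificEmail"}
ALWAYS_INCLUDED = {"TollFreePhone"}

def public_contact_values(contact_xml):
    """Return (tag, text) pairs that may appear in the vendor output."""
    root = ET.fromstring(contact_xml)
    kept = []
    for child in root:
        if child.tag in ALWAYS_INCLUDED:
            kept.append((child.tag, child.text))
        elif child.tag in HONORS_PUBLIC_FLAG and child.get("Public") != "No":
            kept.append((child.tag, child.text))
    return kept

print(public_contact_values(CONTACT))
```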

Comment entered 2010-02-25 14:25:06 by Osei-Poku, William (NIH/NCI) [C]

BZDATETIME::2010-02-25 14:25:06
BZCOMMENTOR::William Osei-Poku
BZCOMMENT::88

(In reply to comment #87)
> (In reply to comment #84)
> > 4. Service Limitation not displaying in publish preview - Example 664951
>
> Service Limitation doesn't appear to be an element in the DTD.
> I'm not really sure what this field maps to?
>

The Service Limitation element is referred to as "NOTES" in the DTD. This matches the information in the record for CDR0000664951 and its display on cancer.gov
http://www.cancer.gov/search/view_geneticspro.aspx?personid=556197

Comment entered 2010-02-25 15:09:19 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2010-02-25 15:09:19
BZCOMMENTOR::Volker Englisch
BZCOMMENT::89

(In reply to comment #84)
> 4. Service Limitation not displaying in publish preview - Example 664951

Fixed on MAHLER.

Comment entered 2010-02-25 17:42:08 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2010-02-25 17:42:08
BZCOMMENTOR::Volker Englisch
BZCOMMENT::90

(In reply to comment #86)
> Another vendor filter change that we may have to make is for the use of
> PrivatePractice locations.

For the GP address the INSTITUTION field is mandatory but we don't list an "institution" for the PrivatePractice.
How should we deal with this situation?

Comment entered 2010-02-26 13:01:01 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2010-02-26 13:01:01
BZCOMMENTOR::Volker Englisch
BZCOMMENT::91

I've modified the filter to allow PrivatePractice locations to be displayed in the vendor output.
We still need to decide how to handle the mandatory Institution element in the vendor output for those addresses.

Comment entered 2010-02-26 15:09:55 by Osei-Poku, William (NIH/NCI) [C]

BZDATETIME::2010-02-26 15:09:55
BZCOMMENTOR::William Osei-Poku
BZCOMMENT::92

(In reply to comment #89)
> (In reply to comment #84)
> > 4. Service Limitation not displaying in publish preview - Example 664951
>
> Fixed on MAHLER.

Please promote these changes to Bach as it is difficult to test with data on Mahler.

Comment entered 2010-02-26 18:42:24 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2010-02-26 18:42:24
BZCOMMENTOR::Volker Englisch
BZCOMMENT::93

(In reply to comment #92)
> Please promote these changes to Bach as it is difficult to test with data on
> Mahler.

This is typically not a sufficient argument to test on a production system.
However, the new changes only affect the GP documents, which aren't being published at the moment, and the global module (CDR315588) had already been promoted for OCECDR-2896 when it shouldn't have been.

The following filter has been copied to BACH:
CDR559215 - R9505: Vendor Filter: GeneticsProfessional

Comment entered 2010-03-01 12:12:26 by Osei-Poku, William (NIH/NCI) [C]

BZDATETIME::2010-03-01 12:12:26
BZCOMMENTOR::William Osei-Poku
BZCOMMENT::94

(In reply to comment #93)
> (In reply to comment #92)
> > Please promote these changes to Bach as it is difficult to test with data on
> > Mahler.
>
> This is typically not a sufficient argument to test on a production system.
> However, the new changes only affect the GP documents which aren't being
> published at the moment and the global module (CDR315588) had already been
> promoted for OCECDR-2896 when it shouldn't have.
>
> The following filter has been copied to BACH:
> CDR559215 - R9505: Vendor Filter: GeneticsProfessional

1. The emails issue has been solved
2. The Service Limitation issue has also been solved.
3. The private practice location is now displaying but phone numbers and email addresses don’t display.

Comment entered 2010-03-02 15:47:31 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2010-03-02 15:47:31
BZCOMMENTOR::Volker Englisch
BZCOMMENT::95

(In reply to comment #94)
> 3. The private practice location is now displaying but phone numbers and email
> addresses don’t display.

Could you give me a sample on BACH that I could look at? I tested it on MAHLER and the phone and email do display for the PrivatePractice location.

Comment entered 2010-03-02 15:50:23 by Osei-Poku, William (NIH/NCI) [C]

BZDATETIME::2010-03-02 15:50:23
BZCOMMENTOR::William Osei-Poku
BZCOMMENT::96

(In reply to comment #95)
> (In reply to comment #94)
> > 3. The private practice location is now displaying but phone numbers and email
> > addresses don’t display.
>
> Could you give me a sample on BACH that I could look at? I tested it on MAHLER
> and the phone and email does display for the PrivatePractice location.

Here is one - CDR0000664785

Comment entered 2010-03-02 17:36:11 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2010-03-02 17:36:11
BZCOMMENTOR::Volker Englisch
BZCOMMENT::97

The problem with the email and phone not showing for the PrivatePractice locations has been fixed.
CDR559215 - R9505: Vendor Filter: GeneticsProfessional

This is ready for review on MAHLER.

Comment entered 2010-03-03 09:37:17 by Osei-Poku, William (NIH/NCI) [C]

BZDATETIME::2010-03-03 09:37:17
BZCOMMENTOR::William Osei-Poku
BZCOMMENT::98

(In reply to comment #97)
> The problem with the email and phone not showing for the PrivatePractice
> locations has been fixed.
> CDR559215 - R9505: Vendor Filter: GeneticsProfessional
>
> This is ready for review on MAHLER.

The phone number shows up but not the email address. Tested with 657322 on Mahler.

Comment entered 2010-03-03 09:42:32 by Osei-Poku, William (NIH/NCI) [C]

BZDATETIME::2010-03-03 09:42:32
BZCOMMENTOR::William Osei-Poku
BZCOMMENT::99

(In reply to comment #98)
> (In reply to comment #97)
> > The problem with the email and phone not showing for the PrivatePractice
> > locations has been fixed.
> > CDR559215 - R9505: Vendor Filter: GeneticsProfessional
> >
> > This is ready for review on MAHLER.
>
> The phone number shows up but not the email address. Tested with 657322 on
> Mahler.

I take it back. It shows up correctly. Sorry! The document had a validation error and once I fixed it, the email showed up. Please promote to Bach.

Comment entered 2010-03-04 15:11:07 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2010-03-04 15:11:07
BZCOMMENTOR::Volker Englisch
BZCOMMENT::100

Per discussion at today's status meeting, CIAT will enter a few new GP Person documents on BACH and let me know when that has been done.
The following day, after the nightly backup has taken a copy of the CDR, I will refresh the CDR on FRANCK and run a publishing job to the GatekeeperGK server.
The result of this publishing job should be very close to the final documents to be published on BACH after the cleanup has been completed.

Comment entered 2010-03-11 10:28:51 by Osei-Poku, William (NIH/NCI) [C]

BZDATETIME::2010-03-11 10:28:51
BZCOMMENTOR::William Osei-Poku
BZCOMMENT::101

(In reply to comment #99)
> (In reply to comment #98)

> I take it back. It shows up correctly. Sorry! The document had a validation
> error and once I fixed it, the email showed up. Please promote to Bach.

Please promote the above changes to Bach.

Also, I just noticed that pub preview fails because of the presence of the SpecificWebSite element. Can you modify the software to ignore this element or modify it such that it does not cause pub preview to fail? I know we do not display the web site info on cancer.gov, but when the conversion was done, all the records with web site data were converted into the SpecificWebSite element and we need to keep the history.

Comment entered 2010-03-11 10:38:45 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2010-03-11 10:38:45
BZCOMMENTOR::Volker Englisch
BZCOMMENT::102

(In reply to comment #101)
> Also, I just noticed that pub preview fails because of the presence of the
> SpecificWebSite element. Can you modify the software to ignore this element or
> modify it such that it does not cause pub preview to fail?

I made the changes to include the SpecificWebSite element yesterday because Bob needed the information for the mailers but I haven't finished the change to remove the element again from the vendor output.

Comment entered 2010-03-12 13:50:36 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2010-03-12 13:50:36
BZCOMMENTOR::Volker Englisch
BZCOMMENT::103

I've made changes to remove the SpecificFax/WebSite from the vendor output and I'm in the process of running a publishing job on FRANCK to diff the output.

Comment entered 2010-03-15 09:58:16 by Osei-Poku, William (NIH/NCI) [C]

BZDATETIME::2010-03-15 09:58:16
BZCOMMENTOR::William Osei-Poku
BZCOMMENT::104

1. For Canadian addresses (this will also be true for UK and Australian addresses), the provinces are shortened to two letters like US states, for example, ‘ON’ for Ontario. Because the abbreviation is on the same line as the zip code, it looks as if it is part of the zip code. For example:

Cambridge Memorial Hospital
700 Coronation Boulevard
Cambridge, ON N1R 3G2
Canada
e-mail:

I think it will be good to spell out the provinces completely.

2. Also, whenever a professional does not have an email address, there is an empty 'e-mail' label (as above). I think it would be good not to display the e-mail label when an email address is not provided.
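As a rough sketch of the second request (not the actual filter code, which is not part of this ticket), the label would be written only when an address is present. The function name and the list-of-lines representation are assumptions made for the example.

```python
def contact_lines(address_lines, email=None):
    """Return the display lines, adding an e-mail line only if one exists."""
    lines = list(address_lines)
    # Skip the label entirely for a missing or whitespace-only address.
    if email and email.strip():
        lines.append("e-mail: " + email.strip())
    return lines


address = [
    "Cambridge Memorial Hospital",
    "700 Coronation Boulevard",
    "Cambridge, ON N1R 3G2",
    "Canada",
]
without_email = contact_lines(address, "")
with_email = contact_lines(address, "someone@example.org")  # made-up address
```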

Comment entered 2010-03-15 12:09:16 by Osei-Poku, William (NIH/NCI) [C]

BZDATETIME::2010-03-15 12:09:16
BZCOMMENTOR::William Osei-Poku
BZCOMMENT::105

We have finished the Gen. Prof. Cleanup so we can proceed with the next steps.

Comment entered 2010-03-15 12:10:09 by Osei-Poku, William (NIH/NCI) [C]

BZDATETIME::2010-03-15 12:10:09
BZCOMMENTOR::William Osei-Poku
BZCOMMENT::106

(In reply to comment #105)
> We have finished the Gen. Prof. Cleanup so we can proceed with the next steps.

I forgot to mention that I have entered 3 new applications/records as well.

Comment entered 2010-03-15 12:12:06 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2010-03-15 12:12:06
BZCOMMENTOR::Volker Englisch
BZCOMMENT::107

(In reply to comment #105)
> we can proceed with the next steps.

The next step was to wait until tomorrow so that we can use tonight's backup file from BACH to refresh the CDR database on FRANCK and then run a GP publishing job.

Comment entered 2010-03-15 13:25:25 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2010-03-15 13:25:25
BZCOMMENTOR::Volker Englisch
BZCOMMENT::108

(In reply to comment #104)
> 1.For Canadian addresses
[...]
>
> Cambridge Memorial Hospital
> 700 Coronation Boulevard
> Cambridge, ON N1R 3G2
> Canada
> e-mail:
>
> I think it will be good to spell out the provinces completely.

I've browsed the web and it appears there are three commonly used versions of how to list the City/Province/Postal code for Canadian addresses
a) The "American" version
Cambridge, ON N1R 3G2
b) The "technically correct" version
Cambridge
ON, N1R 3G2
c) The "I-don't-know-what-to-call-it" version
Cambridge, ON
N1R 3G2
None of the versions that I've found on the Internet has the province spelled out.

In addition, all of our addresses are created using a global template. This means that any change to the format of the address block will be reflected on any other address throughout the CDR.
For a change of this magnitude I would prefer for Margaret or Lakshmi to comment.

As an alternative I could create a PostalAddress template that's only used by GP documents.
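To make the three layouts concrete, here is a small Python sketch that renders the City/Province/Postal code part of an address in each style. It is only an illustration; the real address block is produced by the global template mentioned above, and the function and style names are invented for this example.

```python
def city_lines(city, province, postal_code, style="american"):
    """Return the city/province/postal-code lines in one of three layouts."""
    if style == "american":            # version a)
        return [f"{city}, {province} {postal_code}"]
    if style == "province_with_code":  # version b)
        return [city, f"{province}, {postal_code}"]
    if style == "code_on_own_line":    # version c)
        return [f"{city}, {province}", postal_code]
    raise ValueError(f"unknown style: {style}")


a = city_lines("Cambridge", "ON", "N1R 3G2", "american")
b = city_lines("Cambridge", "ON", "N1R 3G2", "province_with_code")
c = city_lines("Cambridge", "ON", "N1R 3G2", "code_on_own_line")
```

A GP-only PostalAddress template would amount to choosing a different style for one document type while leaving the default for all others unchanged.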

Comment entered 2010-03-15 13:57:31 by Osei-Poku, William (NIH/NCI) [C]

BZDATETIME::2010-03-15 13:57:31
BZCOMMENTOR::William Osei-Poku
BZCOMMENT::109

I just wanted to add that we are currently spelling it out on cancer.gov

http://www.cancer.gov/search/view_geneticspro.aspx?personid=556013

But when we publish them anew, they won't be spelled out anymore.

Comment entered 2010-03-15 14:07:18 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2010-03-15 14:07:18
BZCOMMENTOR::Volker Englisch
BZCOMMENT::110

(In reply to comment #109)
> I just wanted to add that we are currently spelling it out on cancer.gov

That's correct. We also spell the city in capital letters and we won't be doing that anymore.

As I said, I would like Lakshmi and Margaret to give guidance.
My personal preference is for consistency across all document types.

By the way, I checked with a Canadian native and she said she would format the address this way:
Ms. John Doe
President
123 E. Kensington St
North Vancouver, BC
V17 1P2
CANADA

Comment entered 2010-03-15 14:08:40 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2010-03-15 14:08:40
BZCOMMENTOR::Volker Englisch
BZCOMMENT::111

(In reply to comment #104)
> 2.Also, whenever a professional does not have an email, there is an empty
> 'e-mail ' tag (like above).

This has been fixed on MAHLER.

Comment entered 2010-03-15 14:52:55 by Osei-Poku, William (NIH/NCI) [C]

BZDATETIME::2010-03-15 14:52:55
BZCOMMENTOR::William Osei-Poku
BZCOMMENT::112

(In reply to comment #111)
> (In reply to comment #104)
> > 2.Also, whenever a professional does not have an email, there is an empty
> > 'e-mail ' tag (like above).
>
> This has been fixed on MAHLER.

Verified on Mahler. Please promote to Bach.

Comment entered 2010-03-15 15:41:15 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2010-03-15 15:41:15
BZCOMMENTOR::Volker Englisch
BZCOMMENT::113

The following filters have been copied to FRANCK and BACH:
CDR315588 - R9524: Module: Vendor Cleanup Templates
CDR559215 - R9524: Vendor Filter: GeneticsProfessional

I ran a diff report on FRANCK beforehand and the only changes identified were the 72 term documents that were affected by the filter change in OCECDR-3102.

This is ready to be verified on BACH.
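The diff report tool itself is not described in this ticket. As a hedged sketch of the general idea, the following Python compares two directories of exported documents and lists the files whose contents differ; the directory layout and function name are assumptions.

```python
import filecmp
import os


def changed_files(old_dir, new_dir):
    """List files present in both directories whose contents differ.

    Returns (mismatched, errors); errors are files that could not be compared.
    """
    common = sorted(set(os.listdir(old_dir)) & set(os.listdir(new_dir)))
    # shallow=False forces a byte-by-byte comparison instead of trusting
    # size and modification time alone.
    _match, mismatch, errors = filecmp.cmpfiles(
        old_dir, new_dir, common, shallow=False
    )
    return sorted(mismatch), sorted(errors)
```

Files that exist in only one of the two directories are ignored here; a fuller report would list those separately.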

Comment entered 2010-03-16 16:22:30 by Osei-Poku, William (NIH/NCI) [C]

BZDATETIME::2010-03-16 16:22:30
BZCOMMENTOR::William Osei-Poku
BZCOMMENT::114

(In reply to comment #113)
> The following filters have been copied to FRANCK and BACH:
> CDR315588 - R9524: Module: Vendor Cleanup Templates
> CDR559215 - R9524: Vendor Filter: GeneticsProfessional
>
> I ran a diff report before on FRANCK and the only changes identified were the
> 72 term documents that were effected by the filter change in OCECDR-3102.
>
> This is ready to be verified on BACH.

Do you mean, I should verify the email changes reported in comment 112?

Comment entered 2010-03-17 10:12:03 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2010-03-17 10:12:03
BZCOMMENTOR::Volker Englisch
BZCOMMENT::115

At this point you can probably just wait until the publishing to GatekeeperGK has finished at the end of the day and preview the result on the Cancer.gov test server.

Comment entered 2010-03-18 10:32:18 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2010-03-18 10:32:18
BZCOMMENTOR::Volker Englisch
BZCOMMENT::116

The latest GP data has been loaded to the GatekeeperGK test server and is ready for review:
http://wwwgk.cancer.gov/search/geneticsservices/

I noticed that several GPs are listed twice on the GK server. This is because we've submitted the GPs twice with different CDR-IDs: once from MAHLER (or old FRANCK) and now once from FRANCK (refreshed data from BACH) with a different CDR-ID.
One can identify the latest one to be QC'ed as follows:
After searching for a name hover the mouse over the link to the person. The link will display a 'personid'. The link with the higher personid (a.k.a. CDR-ID) is the one most recently pushed to the GK server.

Example:
For Mary J. Ahrens there exist two links
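The rule described in this comment (the link with the higher personid is the more recent submission) can be sketched in Python as follows. The URL pattern is the one used elsewhere in this ticket; the two personid values are illustrative and are not the links for the person named above.

```python
from urllib.parse import parse_qs, urlparse


def person_id(url):
    """Extract the numeric personid query parameter from a GP link."""
    query = parse_qs(urlparse(url).query)
    return int(query["personid"][0])


def latest_link(urls):
    """Return the link with the highest personid (a.k.a. CDR-ID)."""
    return max(urls, key=person_id)


# Illustrative links only; the IDs are not tied to any particular person.
links = [
    "http://wwwgk.cancer.gov/search/view_geneticspro.aspx?personid=556013",
    "http://wwwgk.cancer.gov/search/view_geneticspro.aspx?personid=664881",
]
newest = latest_link(links)
```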

Comment entered 2010-03-19 14:59:05 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2010-03-19 14:59:05
BZCOMMENTOR::Volker Englisch
BZCOMMENT::117

I've submitted the latest Cancer.gov DTD to Blair. This DTD makes the ID element optional.

Comment entered 2010-03-19 14:59:05 by Englisch, Volker (NIH/NCI) [C]

Attachment pdqCG.dtd has been added with description: Cancer.gov DTD

Comment entered 2010-03-23 11:07:38 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2010-03-23 11:07:38
BZCOMMENTOR::Volker Englisch
BZCOMMENT::118

FYI:
There were three new GP documents (not converted) for which we deleted the LegacyGeneticsData block. Those documents were successfully published to the GatekeeperGK test system.

Comment entered 2010-03-23 11:29:12 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2010-03-23 11:29:12
BZCOMMENTOR::Volker Englisch
BZCOMMENT::119

Please note that we haven't answered the question of comment #90 on what to display for the mandatory INSTITUTION element in the case of a PrivatePractice location.

Comment entered 2010-03-23 12:35:31 by Osei-Poku, William (NIH/NCI) [C]

BZDATETIME::2010-03-23 12:35:31
BZCOMMENTOR::William Osei-Poku
BZCOMMENT::120

(In reply to comment #116)
> The latest GP data has been loaded to the GatekeeperGK test server and is ready
> for review:
> http://wwwgk.cancer.gov/search/geneticsservices/
> the data of this Person's record in the CDR.

We have finished testing. We did not see any problems with the data so we are good to go.

There are two minor issues that we can address later:

1. A few records (probably no more than 2) have values in the <PersonTitle> element but this is not captured in the vendor filter. For example: 3766 and 664946.

2. There seems to be a slight discrepancy between pub preview and what is on the test site for at least one Canadian record, CDR0000664881. The email and phone number are both public in the record. However, in pub preview the email does not display, while both the email and phone number display correctly on the test site.
http://wwwgk.cancer.gov/search/view_geneticspro.aspx?personid=664881

Comment entered 2010-03-24 10:25:25 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2010-03-24 10:25:25
BZCOMMENTOR::Volker Englisch
BZCOMMENT::121

(In reply to comment #120)
> 1. A few records (probably no more than 2) have values in the <PersonTitle>
> element but this is not captured in the vendor filter. For example : 3766 &
> 664946.

For CDR3766 I don't see the PersonTitle listed in the record on Cancer.gov or wwwGK.cancer.gov.
For CDR664946 a PersonTitle does exist but it had been included as part of the address block since the DTD doesn't have any element for a person title.
Is this what we want to do with the PersonTitle element, i.e. stuff it into the address block?

> 2. There seems to be slight a discrepancy between pub preview and what is on
> test site for at least one Canadian record- CDR0000664881. The email and phone
> number are both public in the record. However, in pub preview, the email does
> not display but the email and phone number correctly display on the test site.
> http://wwwgk.cancer.gov/search/view_geneticspro.aspx?personid=664881

Did you run PP on BACH or on FRANCK? The data on the wwwGK test site has been created from FRANCK. Also, the vendor filters on FRANCK are not up-to-date.
You should only compare the data on the test site with the data/reports on FRANCK.

Comment entered 2010-03-24 11:14:53 by Osei-Poku, William (NIH/NCI) [C]

BZDATETIME::2010-03-24 11:14:53
BZCOMMENTOR::William Osei-Poku
BZCOMMENT::122

(In reply to comment #121)
> (In reply to comment #120)
> > 1. A few records (probably no more than 2) have values in the <PersonTitle>
> > element but this is not captured in the vendor filter. For example : 3766 &
> > 664946.
>
> For CDR3766 I don't see the PersonTitle listed in the record on Cancer.gov or
> wwwGK.cancer.gov.
> For CDR664946 a PersonTitle does exist but it had been included as part of the
> address block since the DTD doesn't have any element for a person title.
> Is this what we would want to do with the PersonTitle element to stuff it into
> the address block?
>
Yes. We will want it displayed before the address information (just as it is in the CDR). We can discuss this later. As I said, only about 2 records were affected, so I can include this in any vendor filter changes we make in the future.

> > 2. There seems to be slight a discrepancy between pub preview and what is on
> > test site for at least one Canadian record- CDR0000664881. The email and phone
> > number are both public in the record. However, in pub preview, the email does
> > not display but the email and phone number correctly display on the test site.
> > http://wwwgk.cancer.gov/search/view_geneticspro.aspx?personid=664881
>
> Did you run PP on BACH or on FRANCK? The data on the wwwGK test site has been
> created from FRANCK. Also, the vendor filters on FRANCK are not up-to-date.
> You should only compare the data on the test site with the data/reports on
> FRANCK.

You're right. When I compared with pub preview on Franck, there was no discrepancy.

Comment entered 2010-03-24 13:47:10 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2010-03-24 13:47:10
BZCOMMENTOR::Volker Englisch
BZCOMMENT::123

It looks like everything is ready to go for the publishing of the GeneticsProfessional Persons on BACH.

We thought it might be good to go over the steps to be done during our status meeting and then publish the GP documents after the nightly publishing has finished on Thursday.

Comment entered 2010-03-25 19:12:21 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2010-03-25 19:12:21
BZCOMMENTOR::Volker Englisch
BZCOMMENT::124

FYI:
Regarding our discussion this afternoon about whether the GP update and the removal of the old GP documents can be processed at the same time: this is not possible. The publishing software doesn't allow updates and removals to be processed at the same time for a single document type.
Removals and updates are processed as part of the Export where the removals are indicated by documents that are blocked.

Comment entered 2010-03-25 21:08:45 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2010-03-25 21:08:45
BZCOMMENTOR::Volker Englisch
BZCOMMENT::125

The following filter has been copied to BACH:
CDR559215 - R9553: Vendor Filter: GeneticsProfessional

The modified publishing document has been stored in the CDR. This document allows the Person document type to be published as GeneticsProfessional documents:
CDR178 - V51

The following program has been copied to BACH to allow Person documents (which were formerly suppressed) to be pushed to Cancer.gov as GP documents:
cdrpub.py - R9458

I ran a Hotfix-Remove request for the old 535 documents and stopped publishing on Gatekeeper at the Preview stage.
Then I ran the Export-GeneticsProfessional publishing job to create the new 538 Person/GP documents and also stopped publishing on Gatekeeper at the Preview stage. Once the results of both jobs were verified, I manually pushed the remove request from the preview site to the live site and then submitted the load of the new documents. Between these two jobs there was a period of about 5-8 minutes when no GP documents were available on Cancer.gov.

Everything worked without a problem but I noticed that the GP names are missing a space between the middle initial and the last name of a person's name.
This had already been fixed on the Gatekeeper test server and I asked Blair and Mini to have this change moved to production.

We should leave this issue open until the weekly publishing job has finished properly.

Comment entered 2010-03-26 14:14:11 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2010-03-26 14:14:11
BZCOMMENTOR::Volker Englisch
BZCOMMENT::126

(In reply to comment #125)
> Everything worked without a problem but I noticed that the GP names are missing
> a space between the middle initial and the last name of a person's name.

FYI: The spacing problem on Cancer.gov has been fixed.

Comment entered 2010-04-01 10:06:19 by Osei-Poku, William (NIH/NCI) [C]

BZDATETIME::2010-04-01 10:06:19
BZCOMMENTOR::William Osei-Poku
BZCOMMENT::127

I have looked at a lot of the published documents on cancer.gov and did not see any problems. However, when searching by the Family Cancer menu items only, some of them do not retrieve any documents but I am sure there are documents that have been assigned these syndromes.

Does this have to do with the way the term documents were set up? For example, on cancer.gov when you select Adenomatous polyposis, no documents are retrieved. Meanwhile, the name of the syndrome in the CDR is familial adenomatous polyposis (CDR0000042839) and the display name is Adenomatous polyposis, familial. The cancer type items appear to work fine.

Comment entered 2010-04-01 10:12:17 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2010-04-01 10:12:17
BZCOMMENTOR::Volker Englisch
BZCOMMENT::128

Is this something that was working with the test load on GatekeeperGK?

Comment entered 2010-04-01 10:18:15 by Osei-Poku, William (NIH/NCI) [C]

BZDATETIME::2010-04-01 10:18:15
BZCOMMENTOR::William Osei-Poku
BZCOMMENT::129

(In reply to comment #128)
> Is this something that was working with the test load on GatekeeperGK?

I do not remember testing this on GatekeeperGK. I concentrated more on data problems on GK and only searched by the names of the professionals.

Comment entered 2010-04-01 16:55:57 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2010-04-01 16:55:57
BZCOMMENTOR::Volker Englisch
BZCOMMENT::130

The latest problem with the search by syndrome names (when the names had changed) has been resolved on Cancer.gov.
A table that is built on Gatekeeper included the new names, but that table did not update a similar table on Cancer.gov. The update has been performed manually by Min.

Comment entered 2010-04-06 13:26:53 by Osei-Poku, William (NIH/NCI) [C]

BZDATETIME::2010-04-06 13:26:53
BZCOMMENTOR::William Osei-Poku
BZCOMMENT::131

(In reply to comment #130)
> The latest problem with the search by syndrome names (when the names had
> changed) has been resolved on Cancer.gov.
> There was a table that is build on Gatekeeper including the new names but that
> table did not update a similar table on Cancer.gov. The update has been
> performed manually by Min.

I checked this on cancer.gov and everything seems to be working well. It looks like we can close this issue, right?

Comment entered 2010-04-06 13:47:51 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2010-04-06 13:47:51
BZCOMMENTOR::Volker Englisch
BZCOMMENT::132

Margaret noticed one more minor problem with the display on Cancer.gov:

After the Additional Information at the end of some records there is no space between that label and the text. (see Daly, Mary as an example).

I'll have to report this to Blair to get it fixed but I don't know at this point if there should be a space or a newline after the heading.
Do you know?

Comment entered 2010-04-06 14:19:11 by Osei-Poku, William (NIH/NCI) [C]

BZDATETIME::2010-04-06 14:19:11
BZCOMMENTOR::William Osei-Poku
BZCOMMENT::133

(In reply to comment #132)
> Margaret noticed one more minor problem with the display on Cancer.gov:
>
> After the Additional Information at the end of some records there is no space
> between that label and the text. (see Daly, Mary as an example).
>
> I'll have to report this to Blair to get fixed but I don't know at this point
> if there should be a space or a newline after the heading.
> Do you know?

There's supposed to be a space. We have copies of some of the documents before conversion.

Comment entered 2010-04-08 15:58:05 by Englisch, Volker (NIH/NCI) [C]

BZDATETIME::2010-04-08 15:58:05
BZCOMMENTOR::Volker Englisch
BZCOMMENT::134

All vendor filter changes have been taken care of.

The spacing issue will be addressed by the Cancer.gov team.

Closing issue.

Attachments
File Name Posted User
CDR556157.xml 2009-11-23 16:57:23 Englisch, Volker (NIH/NCI) [C]
CDR828.xml 2009-11-23 16:58:40 Englisch, Volker (NIH/NCI) [C]
GenProf_19859_Vendor.xml 2009-11-12 15:35:43 Englisch, Volker (NIH/NCI) [C]
pdqCG.dtd 2010-03-19 14:59:05 Englisch, Volker (NIH/NCI) [C]
