Issue Number | 3148 |
---|---|
Summary | [HP Summary Section] Programmatic population of Affiliation elements |
Created | 2010-05-13 09:40:28 |
Issue Type | Improvement |
Submitted By | Beckwith, Margaret (NIH/NCI) [E] |
Assigned To | Kline, Bob (NIH/NCI) [C] |
Status | Closed |
Resolved | 2010-08-30 11:23:23 |
Resolution | Fixed |
Path | /home/bkline/backups/jira/ocecdr/issue.107476 |
BZISSUE::4835
BZDATETIME::2010-05-13 09:40:28
BZCREATOR::Margaret Beckwith
BZASSIGNEE::Bob Kline
BZQACONTACT::William Osei-Poku
I wasn't sure what component to use for this. We are adding two elements to the PDQ Board Member Info records called AffiliationName and AffiliationPlace. These are text elements that will need to be populated using the Board roster information found on Cancer.gov. We can discuss what needs to be done for this at a CDR meeting.
BZDATETIME::2010-05-13 10:28:14
BZCOMMENTOR::Bob Kline
BZCOMMENT::1
Changed target version to CDR 1.0.
BZDATETIME::2010-05-14 12:41:11
BZCOMMENTOR::Bob Kline
BZCOMMENT::2
There are two board members without any affiliation name:
R. Beverly Raney (pediatric treatment, in Austin)
Jean Fourcroy (screening & prevention, Bethesda)
Want to track down affiliation names for these two?
BZDATETIME::2010-05-14 12:49:32
BZCOMMENTOR::Margaret Beckwith
BZCOMMENT::3
Sure, and I will ask the Board Managers to get Cancer.gov updated as well.
BZDATETIME::2010-05-14 13:48:55
BZCOMMENTOR::Margaret Beckwith
BZCOMMENT::4
For Bev Raney use: Dell Children's Medical Center of Central Texas
Jean Fourcroy is an interesting case because we don't have any kind of professional information for her; all we have is a home address. Maybe she's retired. I have asked Val to see if she has an affiliation we can use. Otherwise, we may need to leave this optional.
BZDATETIME::2010-05-17 11:53:03
BZCOMMENTOR::Bob Kline
BZCOMMENT::5
I have extracted all of the information we'll need for populating the Affiliation blocks in the PDQBoardMemberInfo documents. I have stored the results temporarily in this XML document:
http://mahler.nci.nih.gov/PDQBoardMembers.xml
The structure of the Affiliations block for each board member reflects what I'm thinking might work best both for the board directories on cancer.gov (in the event that we abandon manual maintenance of those rosters for something driven by the CDR data) and for the list of summary maintainers to be populated in each exported summary at publication time. The script to extract the information was fed canned data for the handful of board members who didn't have the simple two-line (name + place) format on cancer.gov, and the rest were handled automatically. I've made AffiliationName a multiply-occurring child of Affiliation in order to handle the MCV/VCU case correctly, and I'm using the Usage attribute to distinguish Affiliation blocks which aren't to be used for both types of export ('BD' means just publish for the board directory roster on cancer.gov, 'SR' means include when publishing the list of reviewers for the summary). You'll see an example of this for Janet Dancey, who has a custom line for display in the list of summary reviewers, and two separate affiliations for the board directory on cancer.gov.
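As a rough illustration of the structure described above, the following Python sketch parses one hypothetical board member entry and selects the Affiliation blocks that apply to each type of export. The element names Affiliation, AffiliationName, and AffiliationPlace and the Usage values 'BD' and 'SR' come from this issue; the BoardMember wrapper, the sample values, and the function name are assumptions made for illustration, and the actual layout of PDQBoardMembers.xml may differ.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; every name and place below is a placeholder.
SAMPLE = """\
<BoardMember>
  <Affiliations>
    <Affiliation Usage="SR">
      <AffiliationName>Example Cancer Institute</AffiliationName>
    </Affiliation>
    <Affiliation Usage="BD">
      <AffiliationName>Example Medical College</AffiliationName>
      <AffiliationName>Example State University</AffiliationName>
      <AffiliationPlace>Springfield, XX</AffiliationPlace>
    </Affiliation>
    <Affiliation>
      <AffiliationName>Example Research Center</AffiliationName>
      <AffiliationPlace>Springfield, XX</AffiliationPlace>
    </Affiliation>
  </Affiliations>
</BoardMember>
"""


def affiliations_for(member, export):
    """Return (names, place) tuples for one export type ('BD' or 'SR').

    Assumption: an Affiliation without a Usage attribute applies to both
    exports, and one with a Usage attribute applies only to that export.
    """
    selected = []
    for node in member.iter("Affiliation"):
        usage = node.get("Usage")
        if usage is None or usage == export:
            names = [n.text.strip()
                     for n in node.findall("AffiliationName") if n.text]
            place = node.findtext("AffiliationPlace")  # None when absent
            selected.append((names, place))
    return selected


member = ET.fromstring(SAMPLE)
print(affiliations_for(member, "SR"))
print(affiliations_for(member, "BD"))
```

Leaving Usage off the common case keeps most records simple; only members who need different text in the two exports carry the attribute.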
Here's the output of a second script to emulate what the summary publication filters would do with the information to generate the list of reviewers:
http://mahler.nci.nih.gov/PDQBoardReviewers.html
Note that when multiple lines are used for a single affiliation on cancer.gov, they appear here on a single line, separated by a comma. Similarly, when there is more than one affiliation for a board member on cancer.gov, the list of reviewers shows them on a single line, separated by ampersands. Both situations can be overridden for individual board members by creating separate affiliation blocks marked appropriately with the Usage attribute (as I did for Dancey). When there is no affiliation name for a board member (as in the case of Jean Fourcroy) the parentheses following the name are omitted.
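The formatting rules in the paragraph above can be sketched in a few lines of Python. This is only an illustration of the stated rules, not the actual emulation script or publication filter; the function name, the input shape, the exact spacing around the separators, and the sample names are assumptions.

```python
def reviewer_line(name, affiliations):
    """Format one entry for the list of summary reviewers.

    `affiliations` is a list of affiliations, each given as a list of
    affiliation name lines (hypothetical input shape).
    """
    # Multiple name lines within one affiliation are joined with commas.
    rendered = [", ".join(lines) for lines in affiliations if lines]
    if not rendered:
        # No affiliation name at all: omit the parentheses entirely.
        return name
    # Multiple affiliations share one line, separated by ampersands.
    return f"{name} ({' & '.join(rendered)})"


print(reviewer_line("Jane Doe", [["Example Medical College",
                                  "Example State University"]]))
print(reviewer_line("Jane Doe", [["Example Institute"], ["Example Clinic"]]))
print(reviewer_line("Jane Doe", []))
```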
Let's discuss whether this approach will handle all of the cases which might arise, and if so I'll update the schema for task #4834 accordingly, and then proceed with updating the documents on Mahler.
BZDATETIME::2010-05-17 12:02:22
BZCOMMENTOR::Bob Kline
BZCOMMENT::6
One additional twist came to mind that I thought I'd better ask about: is it possible that the same individual would serve on more than one board, and ask that his or her affiliations be listed differently for each board?
BZDATETIME::2010-05-17 12:24:04
BZCOMMENTOR::Margaret Beckwith
BZCOMMENT::7
We do have a couple of members who are on more than one Board. Barry Kramer is on the Adult Treatment and the Screening and Prevention, and I believe Mary Daly is on the Genetics Board and also on the S & P Advisory Board. I don't think that there would ever be a case where they would have different affiliations for the different Boards.
But I do want to make sure that if we ever decided to display the names/affiliations of our Advisory Board members on Cancer.gov we could. It seems like that wouldn't be a problem since there is a block for each Board they are on, but I just thought I would ask.
BZDATETIME::2010-05-17 13:17:10
BZCOMMENTOR::Bob Kline
BZCOMMENT::8
(In reply to comment #7)
> We do have a couple of members who are on more than one Board. Barry Kramer is
> on the Adult Treatment and the Screening and Prevention, and I believe Mary
> Daly is on the Genetics Board and also on the S & P Advisory Board. I don't
> think that there would ever be a case where they would have different
> affiliations for the different Boards.
>
> But I do want to make sure that if we ever decided to display the
> names/affiliations of our Advisory Board members on Cancer.gov we could. It
> seems like that wouldn't be a problem since there is a block for each Board
> they are on, but I just thought I would ask.
Should work fine, as long as you're certain affiliation information won't ever change based on which boards they're members of.
One final gap to address: you'll notice that in some cases the form of the name differs between the roster on Cancer.gov and what you see in http://mahler.nci.nih.gov/PDQBoardReviewers.html. I'm pulling from the personal name information stored in the Person documents. If we wanted to be able to replicate the name forms maintained for the roster on cancer.gov I could add an optional DisplayName element. With that I believe we'd be able to replicate programmatically exactly what's on Cancer.gov without dropping anything on the floor. Is this something we should add?
BZDATETIME::2010-05-17 13:37:39
BZCOMMENTOR::Margaret Beckwith
BZCOMMENT::9
> If we wanted to be
> able to replicate the name forms maintained for the roster on cancer.gov I
> could add an optional DisplayName element. With that I believe we'd be able to
> replicate programmatically exactly what's on Cancer.gov without dropping
> anything on the floor. Is this something we should add?
Yes, I think that is a good idea.
BZDATETIME::2010-05-17 13:46:01
BZCOMMENTOR::Bob Kline
BZCOMMENT::10
OK, I'll go ahead and add that to the schema, and populate it with what I extract from the rosters on cancer.gov. Unless you tell us otherwise, we won't use those customized forms of the names for the list of reviewers exported with the summaries.
BZDATETIME::2010-05-17 14:01:16
BZCOMMENTOR::Margaret Beckwith
BZCOMMENT::11
It seems like we would want the names on the Cancer.gov roster and in the summary to match. Maybe we should talk about this and agree that the name in the Person record should match the name in the roster should match the name as a summary reviewer? This is getting complicated (as usual).
BZDATETIME::2010-05-17 14:22:36
BZCOMMENTOR::Bob Kline
BZCOMMENT::12
I guess there are three approaches you could take on the question of the form of the names:
(1) Put the optional element in the schema. Populate it when the name differs from what would display using the data in the Person document. Manually remove the element for those variants we don't want to keep. The form in the element is used for the cancer.gov roster if the element is present. Otherwise, the name is generated from the Person document.
(2) Put the optional element in the schema. Don't populate it when we automatically populate the data for this task. Manually add the element for variants you want to preserve. The form in the element is used for the cancer.gov roster if the element is present. Otherwise, the name is generated from the Person document.
(3) Don't put the optional element in the schema, making it impossible to have a different form of the name in the roster on cancer.gov once we are populating those pages from the CDR.
Approaches (2) and (3) would produce the same result if no one ever adds the optional element to any of the documents.
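Approaches (1) and (2) share the same lookup rule at publication time and differ only in how the element gets populated. A minimal Python sketch of that rule follows, assuming (hypothetically) that DisplayName would be a direct child of the PDQBoardMemberInfo root and that the name generated from the Person document is passed in as a string; neither detail is settled in this issue, and the sample names are placeholders.

```python
import xml.etree.ElementTree as ET


def roster_name(board_member_info, person_name):
    """Choose the name to show in the cancer.gov roster.

    Uses the optional DisplayName element when it is present and not
    empty; otherwise falls back to the name generated from the Person
    document (`person_name`).
    """
    display = board_member_info.findtext("DisplayName")
    if display and display.strip():
        return display.strip()
    return person_name


with_variant = ET.fromstring(
    "<PDQBoardMemberInfo><DisplayName>J. Q. Doe</DisplayName>"
    "</PDQBoardMemberInfo>")
without_variant = ET.fromstring("<PDQBoardMemberInfo/>")

print(roster_name(with_variant, "Jane Q. Doe"))
print(roster_name(without_variant, "Jane Q. Doe"))
```

Under approach (3) the element does not exist, so the function would reduce to always returning `person_name`.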
BZDATETIME::2010-05-17 14:27:25
BZCOMMENTOR::Margaret Beckwith
BZCOMMENT::13
But your original question (if I was understanding correctly) had to do with what name to use for the summary reviewer name, and these options don't address that (do they?).
BZDATETIME::2010-05-17 14:35:07
BZCOMMENTOR::Bob Kline
BZCOMMENT::14
I was assuming that even if we wanted to be able to preserve the ability to have a different form of the name appear in the roster on cancer.gov you'd want us to stick with the name as stored in the PersonNameInformation block of the Person record when we publish the list of reviewers in the summary. That's why I wrote "Unless you tell us otherwise, we won't use those customized forms of the names for the list of reviewers exported with the summaries" in comment #10. Maybe I should give you a call (it is getting complicated).
BZDATETIME::2010-05-17 14:57:47
BZCOMMENTOR::Bob Kline
BZCOMMENT::15
This is against Mahler, so there may be fewer (or more) discrepancies with Bach.
Attachment pdq-board-member-name-deltas.txt has been added with description: Here's a list of the name variants
BZDATETIME::2010-05-20 13:57:38
BZCOMMENTOR::Margaret Beckwith
BZCOMMENT::16
Just so we have it recorded in the issue, I did talk to the Board Managers about this yesterday. They were unanimous in thinking that the name we use on Cancer.gov in the roster and in the summary review list should match, and should match the name we have in the Board Member/Person records. They all agreed that they want the name with the middle initial if there is one. All but 2 of the cases on the list of discrepancies had to do with presence of middle initial; in some cases Cancer.gov had it and the CDR didn't, and in other cases it was the reverse. One of the two exceptions was a data error, and the other was a case where we might want to use the "display name" option (Arthur Kim Richey, who wants his name to show as A. Kim Richey).
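For future comparisons of this kind, a small Python sketch like the following could separate discrepancies that involve only a middle initial from those that need a closer look. It is not the script that produced the attached list; the function names and the sample names are hypothetical, and it does not try to handle suffixes such as "Jr." or multi-word surnames.

```python
import re


def _is_initial(token):
    # A single letter, optionally followed by a period, e.g. "Q" or "Q.".
    return re.fullmatch(r"[A-Za-z]\.?", token) is not None


def strip_middle_initials(name):
    """Drop initials that sit between the first and last name parts."""
    parts = name.split()
    if len(parts) < 3:
        return " ".join(parts)
    middle = [p for p in parts[1:-1] if not _is_initial(p)]
    return " ".join([parts[0], *middle, parts[-1]])


def classify(roster_name, cdr_name):
    """Label a roster/CDR name pair as match, middle initial, or other."""
    if roster_name.split() == cdr_name.split():
        return "match"
    if strip_middle_initials(roster_name) == strip_middle_initials(cdr_name):
        return "middle initial"
    return "other"


print(classify("Jane Q. Doe", "Jane Doe"))
print(classify("J. Kim Doe", "Jordan Kim Doe"))
```

A leading initial, as in the second call, is deliberately not treated as a middle initial, so that kind of variant is reported as "other" for manual review.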
BZDATETIME::2010-05-21 13:05:16
BZCOMMENTOR::Bob Kline
BZCOMMENT::17
(In reply to comment #16)
> ... a case where we might want to use the "display name" option
> (Arthur Kim Richey, who wants his name to show as A. Kim Richey).
It was decided in yesterday's status meeting that we will not support optional "display name" variants. As soon as the schema has been approved on Mahler (#4834), I'll populate the new elements with data from cancer.gov.
BZDATETIME::2010-05-26 12:37:02
BZCOMMENTOR::Margaret Beckwith
BZCOMMENT::18
I have approved the schema on Mahler.
BZDATETIME::2010-06-08 15:45:44
BZCOMMENTOR::Bob Kline
BZCOMMENT::19
I have done a test run of the global change job on Mahler. After you've had a chance to glance through the results and confirm that things look OK I'll run live on that server. Then we can move on to Franck.
http://mahler.nci.nih.gov/cgi-bin/cdr/ShowGlobalChangeTestResults.py?dir=2010-06-08_15-41-07
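The test-then-live pattern used here can be sketched generically in Python. This is not the CDR global change harness, whose interface is not described in this issue; the function names, the placement of the Affiliations block at the end of the document, the ExistingContent element, and the sample values are all assumptions for illustration.

```python
import difflib
import xml.etree.ElementTree as ET


def add_affiliations(doc_xml, affiliations):
    """Return the document with an Affiliations block appended to the root.

    `affiliations` is a list of (names, place) tuples. Appending at the
    end of the root is an assumption; the schema decides the real spot.
    """
    root = ET.fromstring(doc_xml)
    block = ET.SubElement(root, "Affiliations")
    for names, place in affiliations:
        aff = ET.SubElement(block, "Affiliation")
        for name in names:
            ET.SubElement(aff, "AffiliationName").text = name
        if place:
            ET.SubElement(aff, "AffiliationPlace").text = place
    return ET.tostring(root, encoding="unicode")


def run(doc_xml, affiliations, live=False, save=None):
    """Test mode returns a unified diff; live mode hands the XML to save()."""
    new_xml = add_affiliations(doc_xml, affiliations)
    if live:
        save(new_xml)
        return None
    return "\n".join(difflib.unified_diff(
        doc_xml.splitlines(), new_xml.splitlines(),
        fromfile="before", tofile="after", lineterm=""))


DOC = ("<PDQBoardMemberInfo>\n"
       "  <ExistingContent>unchanged</ExistingContent>\n"
       "</PDQBoardMemberInfo>")
CHANGES = [(["Example Health & Science Institute"], "Springfield, XX")]

print(run(DOC, CHANGES))  # test run: review the diff, nothing is saved
```

Running in test mode first gives reviewers a diff to inspect before any document is written, which mirrors the review step requested in this comment.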
BZDATETIME::2010-06-17 10:27:17
BZCOMMENTOR::Margaret Beckwith
BZCOMMENT::20
I looked at a bunch of these (definitely not all of them) and noticed a couple of things that are probably data issues:
CDR0000450761 Oregon Health &amp; Science University Cancer Institute
CDR0000597137 H. Lee Moffitt Cancer Center &amp; Research Institute
(should be an & sign or spell out the word and).
Definitely data issues:
CDR0000369832 President and CEO, Samueli Institute (should we include the title?)
CDR0000539416 Memorial Sloan-Kettering Cancer Center
CDR0000369974 Memorial Sloan Kettering Cancer Center (need hyphen)
I think it looks good and we can move to the next step.
BZDATETIME::2010-06-17 11:18:37
BZCOMMENTOR::Bob Kline
BZCOMMENT::21
(In reply to comment #20)
> I looked at a bunch of these (definitely not all of them) and noticed a couple
> of things that are probably data issues:
>
> CDR0000450761 Oregon Health &amp; Science University Cancer Institute
>
> CDR0000597137 H. Lee Moffitt Cancer Center &amp; Research Institute
>
> (should be an & sign or spell out the word and).
That's how an ampersand is represented in the raw XML.
>
> Definitely data issues:
>
Do you want the software to incorporate code to fix these or are you just making notes for changes the users would make after the global is run on Bach?
> CDR0000369832 President and CEO, Samueli Institute (should we include the
> title?)
Have you decided about this one?
>
> CDR0000539416 Memorial Sloan-Kettering Cancer Center
>
> CDR0000369974 Memorial Sloan Kettering Cancer Center (need hyphen)
>
> I think it looks good and we can move to the next step.
I'll run a live job on Mahler as soon as I hear back on these questions.
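The point about ampersands in raw XML can be checked with a short Python snippet: the serialized form carries the entity, and any XML parser turns it back into a plain ampersand. The institution name below is a placeholder.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Raw XML must escape a literal ampersand as &amp;.
raw = "<AffiliationName>Example Health &amp; Science University</AffiliationName>"

# Parsing yields the text a reader would see in the rendered output.
parsed = ET.fromstring(raw).text
print(parsed)          # Example Health & Science University

# Escaping the parsed text reproduces the raw form shown in the diff report.
print(escape(parsed))  # Example Health &amp; Science University
```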
BZDATETIME::2010-06-17 12:32:46
BZCOMMENTOR::Margaret Beckwith
BZCOMMENT::22
I was just using the issue to make a note about the data issues I found. You don't need to do anything in the code to fix them. So I think we can move to the next step, which is the live run on Mahler.
BZDATETIME::2010-06-21 09:09:12
BZCOMMENTOR::Bob Kline
BZCOMMENT::23
Please review.
Attachment Request4835-live.log has been added with description: Log from live run on Mahler
BZDATETIME::2010-06-21 15:38:35
BZCOMMENTOR::Margaret Beckwith
BZCOMMENT::24
I checked about 10 of these randomly, and they all looked fine to me. I think we can move on to the next step (do a test run on Franck?).
BZDATETIME::2010-06-21 15:48:17
BZCOMMENTOR::Bob Kline
BZCOMMENT::25
Test job run on Franck:
http://franck.nci.nih.gov/cgi-bin/cdr/ShowGlobalChangeTestResults.py?dir=2010-06-21_15-46-16
BZDATETIME::2010-07-07 13:36:34
BZCOMMENTOR::Margaret Beckwith
BZCOMMENT::26
I think these look fine on Franck. I think we are ready to move to the next step--is that a test run on Bach?
BZDATETIME::2010-07-07 15:06:18
BZCOMMENTOR::Bob Kline
BZCOMMENT::27
(In reply to comment #26)
> I think these look fine on Franck. I think we are ready to move to the next
> step--is that a test run on Bach?
We can do that, or live on Franck, whichever you prefer.
BZDATETIME::2010-07-07 15:09:34
BZCOMMENTOR::Margaret Beckwith
BZCOMMENT::28
Oh, I skipped that step! A live run on Franck is fine.
BZDATETIME::2010-07-07 15:28:34
BZCOMMENTOR::Bob Kline
BZCOMMENT::29
Attachment Request4835-live.log has been added with description: Log from live run on Franck
BZDATETIME::2010-07-07 16:15:56
BZCOMMENTOR::Margaret Beckwith
BZCOMMENT::30
I looked at a few of these and they look okay but I almost missed it since the CSS changes haven't been promoted. I think it would be better to QC it, especially when we get to Bach, when everything is in place. I assume we need to wait for Volker to get back to do that?
BZDATETIME::2010-07-07 16:29:01
BZCOMMENTOR::Bob Kline
BZCOMMENT::31
I just installed the CSS on Franck. How does it look now (you'll need to log out of XMetaL and then back in)?
BZDATETIME::2010-07-08 11:24:39
BZCOMMENTOR::Margaret Beckwith
BZCOMMENT::32
I checked a bunch of these and they look great. I think we are ready to move to Bach. We will also need to promote the CSS changes for these elements to Bach in order to QC.
BZDATETIME::2010-07-12 15:25:07
BZCOMMENTOR::Bob Kline
BZCOMMENT::33
I have run a test of the global change on Bach:
http://bach.nci.nih.gov/cgi-bin/cdr/ShowGlobalChangeTestResults.py?dir=2010-07-12_15-23-00
Attachment PDQBoardMembers4835.xml has been added with description: Intermediate file from which population is performed (final)
BZDATETIME::2010-07-19 16:40:48
BZCOMMENTOR::Margaret Beckwith
BZCOMMENT::34
I think these look good and we are ready to run it live on Bach when Bob gets back.
BZDATETIME::2010-07-26 14:48:02
BZCOMMENTOR::Bob Kline
BZCOMMENT::35
Attachment Request4835-live.log has been added with description: Log from live run on Bach
BZDATETIME::2010-07-26 14:48:25
BZCOMMENTOR::Bob Kline
BZCOMMENT::36
The production documents have been updated.
BZDATETIME::2010-07-28 14:35:34
BZCOMMENTOR::Margaret Beckwith
BZCOMMENT::37
I have asked the Board Managers to take a look at all of their Board Member records to make sure the info is correct (and to correct it if not). I will close this when I close all of the other issues.
BZDATETIME::2010-08-30 11:23:23
BZCOMMENTOR::Margaret Beckwith
BZCOMMENT::38
Everything live on Cancer.gov. Issue closed.
File Name | Posted | User |
---|---|---|
pdq-board-member-name-deltas.txt | 2010-05-17 14:57:47 | |
PDQBoardMembers4835.xml | 2010-07-12 15:25:07 | |
Request4835-live.log | 2010-07-26 14:48:02 | |
Request4835-live.log | 2010-07-07 15:28:34 | |
Request4835-live.log | 2010-06-21 09:09:12 | |