Issue Number | 2246 |
---|---|
Summary | External Mapping table - normalizing data |
Created | 2007-06-15 16:16:01 |
Issue Type | Improvement |
Submitted By | priced |
Assigned To | Kline, Bob (NIH/NCI) [C] |
Status | Closed |
Resolved | 2007-06-27 17:09:10 |
Resolution | Won't Fix |
Path | /home/bkline/backups/jira/ocecdr/issue.106574 |
BZISSUE::3330
BZDATETIME::2007-06-15 16:16:01
BZCREATOR::Sheri Khanna
BZASSIGNEE::Bob Kline
BZQACONTACT::Sheri Khanna
We need to normalize data mapping rules to ignore spaces, periods, and hyphens in Facility names to help facilitate better matching, if possible.
I will attach an example of a Facility that is in the mapping table several times because of these issues.
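For illustration only, the requested rule could be sketched roughly as follows. This is not the CDR's actual mapping code; the function name `normalize_facility_name` and the sample facility names are hypothetical, and the sketch assumes the comparison key is built simply by stripping whitespace, periods, and hyphens, leaving case and all other characters untouched.

```python
import re

# Characters the request asks to ignore: whitespace, periods, hyphens.
_IGNORED = re.compile(r"[\s.\-]+")


def normalize_facility_name(name: str) -> str:
    """Return a comparison key that ignores spaces, periods, and hyphens.

    Hypothetical helper: case and other punctuation are left as-is.
    """
    return _IGNORED.sub("", name)


# Made-up variants of one facility name (not taken from the attachment).
variants = [
    "St. Luke Medical Center",
    "St Luke Medical Center",
    "St. Luke Medical-Center",
    "St. Luke Med Center",  # wording differs, so it will not collapse
]

keys = {normalize_facility_name(v) for v in variants}
for key in sorted(keys):
    print(key)
```

With these made-up inputs the first three variants share one key, while the fourth stays distinct because its wording differs, not its punctuation.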
BZDATETIME::2007-06-15 16:16:52
BZCOMMENTOR::Sheri Khanna
BZCOMMENT::1
Attachment Mapping_Normalizingproblem_example.doc has been added with description: CTGov Facility example
BZDATETIME::2007-06-16 05:02:53
BZCOMMENTOR::Lakshmi Grama
BZCOMMENT::2
Some of the issues in your example are not related to spacing, hyphens, or periods. They relate to the data itself: e.g., some zips have 5+4, and some parts of the name are dropped in other variations. We can certainly try to normalize spaces and commas, and see how much we can cut down on variants, but it seems to me that the primary problem is that we cannot match on the name of the organization alone. We need the other information.
BZDATETIME::2007-06-21 10:25:10
BZCOMMENTOR::Bob Kline
BZCOMMENT::3
I thought I had posted a comment similar to Lakshmi's last week, but I must have missed a Bugzilla "you're not logged in" message. I had done some investigation into the case you posted as an example, and discovered that while you found nine entries in the external_map table for the same organization, the modification you're requesting would eliminate only one of these. As Lakshmi points out, the overwhelming majority of the differences come from discrepancies in the data provided by the source, which cannot be normalized away. I can implement the change you request (we'll need to modify the existing data, too), but I question whether the payoff would be worth it, at least based on the cited example of the problem. Do you still want me to proceed with the work on this request?
BZDATETIME::2007-06-27 17:09:10
BZCOMMENTOR::Sheri Khanna
BZCOMMENT::4
(In reply to comment #3)
> but I question whether the payoff would be worth it, at least based on the
> cited example of the problem. Do you still want me to proceed with the work
> on this request?
Trying to normalize the mapping was an issue that OCCM and CIAT had
talked about at one of the Prot. Admin meetings, so this was on our list
of issues to address. It doesn't sound like the end result in this case
would be worth the work, so I will close this issue for now.
Resolution of the issue has been set to "Won't fix".
Setting status to 'Closed'.
File Name | Posted | User |
---|---|---|
Mapping_Normalizingproblem_example.doc | 2007-06-15 16:16:52 |