Multiple phone numbers / email addresses per customer #1079
Replies: 2 comments
As a new user of
The pivoted (wide) form violates the assumption of independent variables. A. There are two |
Hi, I faced the same issue and started by creating columns as you suggest: phone1, phone2. This does not work well, though, for writing comparisons on the multi-valued attribute (the phone, in your case). The approach I took was to store the list of values as an array type under a single column: let's say

If you need to apply blocking rules on this multi-valued data, you may want to evaluate the percentage of records that have a single value. If that proportion is significant, you could apply the blocking rule, say

Kind regards
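To make the array idea above a bit more concrete, here is a minimal sketch in plain Python. It does not use Splink's API. The records, the `phone1`..`phone3` column names, and the helper functions `to_array_form`, `overlap_size` and `single_value_share` are all made up for illustration. It collapses the wide columns into one list per record, counts shared values between two records, and measures what share of records carry exactly one value.

```python
# Hypothetical wide-form records; column names are illustrative only.
wide_records = [
    {"id": 1, "phone1": "555-0101", "phone2": "555-0102", "phone3": None},
    {"id": 2, "phone1": "555-0102", "phone2": None, "phone3": None},
    {"id": 3, "phone1": "555-0199", "phone2": None, "phone3": None},
]


def to_array_form(record, prefix="phone"):
    """Collapse prefix1, prefix2, ... into one sorted, de-duplicated list."""
    values = {v for k, v in record.items() if k.startswith(prefix) and v}
    rest = {k: v for k, v in record.items() if not k.startswith(prefix)}
    rest[prefix + "s"] = sorted(values)
    return rest


def overlap_size(a, b):
    """Number of values two records share, regardless of position."""
    return len(set(a) & set(b))


def single_value_share(records, column):
    """Fraction of records with a non-empty list that hold exactly one value."""
    non_empty = [r for r in records if r[column]]
    if not non_empty:
        return 0.0
    return sum(len(r[column]) == 1 for r in non_empty) / len(non_empty)


array_records = [to_array_form(r) for r in wide_records]

# Records 1 and 2 share one phone number; records 1 and 3 share none.
shared_1_2 = overlap_size(array_records[0]["phones"], array_records[1]["phones"])
shared_1_3 = overlap_size(array_records[0]["phones"], array_records[2]["phones"])

# Two of the three records have exactly one phone number.
share = single_value_share(array_records, "phones")
```

With a single array column, one overlap check replaces a separate condition for every pair of phoneN columns, and the single-value share gives a rough idea of how many records a blocking rule on one value could reach.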
Hi, I am just starting to look into Splink for an MDM challenge.
I have MDM data from 10 sources,
and a customer record can have multiple phone numbers and email addresses (up to 6 for some systems!).
How do I deal with this? Do I need to pivot the data into phonenumber1, phonenumber2, phonenumber3, etc.,
and then write matching conditions for all combinations of phone numbers?
Or do I store the data unpivoted, with all combinations of email/phone number for a customer?
(Besides phone numbers there are postal addresses, customer names, email addresses and company names, but the above shows the problem I am having: how to load multiple 'columns' of the same information type.)
Or is there a different way to go about this?
Thanks!