Replies: 1 comment
-
Hi Nick. This is great feedback, thanks, and nice research on how it's done in dedupe.io - very useful. I agree that choosing blocking rules is more painful than it needs to be, and that we should aim to automate this. Some thoughts:
I'd be interested in your thoughts on the above, and whether you agree. All of the above is a gut feeling; I can't prove it quantitatively. *Humans are often quite bad at manual labelling, in our experience.
-
Posting this here; create an issue from it if you want.
One of the most time-consuming steps for me is tuning the blocking rules. Too loose and I get too many pairs; too strict and I get low recall. It's currently painful to tweak these manually.
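To make the trade-off concrete, here is a rough sketch of the check I end up doing by hand: for each candidate blocking rule, count the pairs it generates and measure recall against a few known matches. This is plain Python, not tied to any library's API; the toy records, the field names, and the helpers `candidate_pairs` and `evaluate` are all made up for illustration.

```python
from collections import defaultdict
from itertools import combinations

# Toy records; the field names and values are invented for illustration.
records = [
    {"id": 1, "first_name": "john", "surname": "smith", "city": "leeds"},
    {"id": 2, "first_name": "jon", "surname": "smith", "city": "leeds"},
    {"id": 3, "first_name": "mary", "surname": "jones", "city": "york"},
    {"id": 4, "first_name": "mary", "surname": "jones", "city": "leeds"},
    {"id": 5, "first_name": "alan", "surname": "smith", "city": "york"},
]

# Hand-labelled true matches, stored as (smaller_id, larger_id) tuples.
true_matches = {(1, 2), (3, 4)}


def candidate_pairs(records, key):
    """Group records by the blocking key and pair up records within each block."""
    blocks = defaultdict(list)
    for record in records:
        blocks[key(record)].append(record["id"])
    pairs = set()
    for ids in blocks.values():
        pairs.update(combinations(sorted(ids), 2))
    return pairs


def evaluate(records, key, true_matches):
    """Return (number of candidate pairs, recall against the labelled matches)."""
    pairs = candidate_pairs(records, key)
    recall = len(pairs & true_matches) / len(true_matches)
    return len(pairs), recall


# Hypothetical blocking rules, from loose to strict.
rules = {
    "same surname": lambda r: r["surname"],
    "same surname and city": lambda r: (r["surname"], r["city"]),
    "same first name and surname": lambda r: (r["first_name"], r["surname"]),
}

for name, key in rules.items():
    n_pairs, recall = evaluate(records, key, true_matches)
    print(f"{name}: {n_pairs} pairs, recall {recall:.2f}")
```

On this toy data the loose rule keeps every true match but produces the most pairs, while both stricter rules cut the pair count and lose one match each. An automated approach would presumably search over rules like these for me.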
The dedupe library learns the blocking rules for you. Could we do something like that?
I believe they
refs: