Description
I am trying to fine-tune BERT for named entity recognition on annotated call center transcripts.
This is a dummy call between 2 agents:
Prabhat - Hello Neeraj, How are you
Neeraj - I am good, Thanks, How are you
Prabhat - I am good as well
Neeraj - For HiHi Phones, we would like to offer 10% discount
Prabhat - I am interested
Neeraj - Could you share your credit card number
Prabhat - Sure one two four
Neeraj - Yes
Prabhat - five three.. no two.. five and nine
Neeraj - Okay, what's next
Prabhat - six two
Neeraj - six and then two
Prabhat - twenty five
Neeraj - So it is five, two, five
Prabhat - Yes
Neeraj - nine, six, two, two and five.
Prabhat - Yes
Neeraj - Thanks for sharing your credit card
Prabhat - Do you need anything else
Neeraj - Yes, what's your customer Id
Prabhat - It is five, two
Neeraj - Okay
Prabhat - three, nine
Neeraj - sure, Thanks. I will place the order for you
We want to tag the numbers that follow the word 'credit' as B-CREDIT/I-CREDIT, while the numbers that follow 'customer Id' should be tagged B-CUSTOMER/I-CUSTOMER.
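To make the target labelling concrete, here is a minimal sketch of the BIO tags for two exchanges from the dummy call. The whitespace tokenization, the punctuation-free tokens, the span indices and the helper name `bio_tags` are assumptions made for illustration, not the actual annotation pipeline.

```python
def bio_tags(tokens, spans):
    """Build BIO tags from (start, end_exclusive, label) spans over tokens."""
    tags = ["O"] * len(tokens)
    for start, end, label in spans:
        tags[start] = f"B-{label}"          # first token of the entity
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"          # remaining tokens of the entity
    return tags

# Hypothetical tokenization of two exchanges from the call (punctuation dropped).
credit_tokens = ["Could", "you", "share", "your", "credit", "card", "number",
                 "Sure", "one", "two", "four"]
credit_tags = bio_tags(credit_tokens, [(8, 11, "CREDIT")])

customer_tokens = ["what's", "your", "customer", "Id", "It", "is", "five", "two"]
customer_tags = bio_tags(customer_tokens, [(6, 8, "CUSTOMER")])

for token, tag in zip(credit_tokens + customer_tokens, credit_tags + customer_tags):
    print(f"{token}\t{tag}")
```

Note that the same surface words ("five", "two") receive different labels in the two exchanges, so the label can only be decided from the surrounding words.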
After training a BERT uncased model, the labels predicted for the numbers look essentially random.
I need clarity on one detail: named entity recognition happens at the entity level, but does BERT keep the surrounding context when assigning a label? For example, if the word 'credit' appeared before a number, the number should be tagged B-CREDIT/I-CREDIT, and if a number occurs after 'account', it should be tagged B-ACCOUNT/I-ACCOUNT.
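To make the question concrete: BERT's token representations are contextual, but only within a single input sequence (at most 512 tokens for the standard BERT checkpoints). So whether 'credit' can influence the label of a later number depends on whether both end up in the same input. The sketch below contrasts feeding each turn on its own with feeding a window of consecutive turns. The helper `context_windows` and the window sizes are hypothetical, and this is only one possible factor behind the output, not a confirmed diagnosis.

```python
def context_windows(turns, size):
    """Join each turn with up to size-1 preceding turns into one input string."""
    windows = []
    for i in range(len(turns)):
        start = max(0, i - size + 1)
        windows.append(" ".join(turns[start:i + 1]))
    return windows

# Speaker names stripped; turns taken from the dummy call.
turns = [
    "Could you share your credit card number",
    "Sure one two four",
    "Yes",
    "five three.. no two.. five and nine",
]

isolated = context_windows(turns, 1)      # one turn per model input
with_context = context_windows(turns, 2)  # current turn plus the previous one
wide_context = context_windows(turns, 4)  # current turn plus three previous ones

print("credit" in isolated[1])      # is the trigger word visible for 'one two four'?
print("credit" in with_context[1])
print("credit" in with_context[3])
print("credit" in wide_context[3])
```

With one turn per input, the digits in 'Sure one two four' arrive with no trigger word at all. A window of two turns covers that case but not the later digits, which only see 'credit' once the window spans the whole exchange.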