Polyphonic character test #24

Open
liroda opened this issue Apr 23, 2024 · 3 comments

Comments

liroda commented Apr 23, 2024

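Hello. In practice, predictions for most polyphonic characters are reasonably good, but the polyphonic character "厦" is especially error-prone. In both of the following sentences it is read as xia4:
通过西门来访者可以进入大厦。 (Visitors can enter the building through the west gate.)
沈阳皇朝万鑫国际大厦 (a building name in Shenyang)

Is this caused by the training set that was used, or by something else? According to the paper, the model was trained on 432 polyphonic characters. Could you share which characters those are?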

Owner

GitYCC commented Sep 9, 2024

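This model's pronunciations are based mainly on Taiwanese readings, so some results will differ. I suggest modifying the rules yourself to map them.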
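One way such a mapping rule could look is a small post-processing step applied to the model's output. The sketch below is only an illustration: `remap_sha`, `XIA4_WORDS`, and the input format (one pinyin string per character) are assumptions and not part of this project's code. It relies on the mainland convention that "厦" is read sha4 in words like 大厦 and xia4 in the place name 厦门; the exception list is illustrative, not exhaustive.

```python
# Hypothetical post-processing sketch; not part of the model's own code.
# Assumption: `readings` holds one predicted pinyin string per character
# of `sentence`, e.g. ["da4", "xia4"] for "大厦".

# Words in which "厦" keeps the xia4 reading in mainland usage
# (illustrative only; extend as needed).
XIA4_WORDS = ("厦门",)


def remap_sha(sentence, readings):
    """Return a copy of `readings` with xia4 -> sha4 for "厦" outside XIA4_WORDS."""
    if len(sentence) != len(readings):
        raise ValueError("expected one reading per character")
    fixed = list(readings)
    for i, ch in enumerate(sentence):
        if ch != "厦" or fixed[i] != "xia4":
            continue
        # Keep xia4 when the character sits inside a listed exception word.
        in_exception = any(
            sentence[start:start + len(word)] == word
            for word in XIA4_WORDS
            for start in range(max(0, i - len(word) + 1), i + 1)
        )
        if not in_exception:
            fixed[i] = "sha4"
    return fixed


if __name__ == "__main__":
    print(remap_sha("大厦", ["da4", "xia4"]))   # ['da4', 'sha4']
    print(remap_sha("厦门", ["xia4", "men2"]))  # ['xia4', 'men2']
```

A rule table like this only covers the characters you list by hand, so it suits a handful of known regional differences rather than systematic ones.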

Author

liroda commented Sep 10, 2024

In that case, would it work to fine-tune the model with some data?

Owner

GitYCC commented Sep 12, 2024

Of course. If you have data to fine-tune with, that would be even better.
