Handling emojis in gersam
Yomguithereal committed May 5, 2021
1 parent 3fcb2e3 commit 7030ea5
Showing 2 changed files with 7 additions and 1 deletion.
3 changes: 2 additions & 1 deletion src/tokenizers/words/gersam.js

5 changes: 5 additions & 0 deletions test/tokenizers/words/gersam.js
@@ -63,6 +63,11 @@ describe('gersam', function() {
      lang: 'fr',
      text: 'Les É.U. sont nuls.',
      tokens: ['Les', 'É.U.', 'sont', 'nuls', '.']
    },
    {
      lang: 'en',
      text: 'This is a very nice cat 🐱! No?',
      tokens: ['This', 'is', 'a', 'very', 'nice', 'cat', '🐱', '!', 'No', '?']
    }
  ];
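
To illustrate what the new English test case asserts, here is a minimal sketch of a word tokenizer that emits an emoji as its own token. It is not the gersam implementation (the change to src/tokenizers/words/gersam.js is not rendered on this page); `sketchTokenize` is a hypothetical name, and the regular expression is an assumption that only covers simple cases.

```javascript
// Hypothetical helper, NOT the library's tokenizer: a rough illustration
// of splitting text into words, emoji and single punctuation marks.
function sketchTokenize(text) {
  // With the `u` flag the regex works on code points, so a single emoji
  // such as U+1F431 is matched as one unit instead of two surrogates.
  //   \p{Extended_Pictographic}  one emoji code point
  //   [\p{L}\p{N}]+              a run of letters and digits
  //   [^\s\p{L}\p{N}]            any other non-space character, alone
  const pattern = /\p{Extended_Pictographic}|[\p{L}\p{N}]+|[^\s\p{L}\p{N}]/gu;

  // String.prototype.match returns null when nothing matches.
  return text.match(pattern) || [];
}
```

Called on 'This is a very nice cat 🐱! No?', this sketch returns the ten tokens listed in the test case. It deliberately ignores abbreviations such as 'É.U.' and multi-code-point emoji sequences (ZWJ sequences, variation selectors), which a real tokenizer would need to handle.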
