The Life-Changing Magic of Tidying Text | Julia Silge #98

Open
utterances-bot opened this issue Jan 23, 2024 · 3 comments

Comments

@utterances-bot

The Life-Changing Magic of Tidying Text | Julia Silge

An R package for text mining using tidy data principles

https://juliasilge.com/blog/life-changing-magic/

@waragamwangi

Hello Julia,
I have just started learning Text Mining with R and came across this regular expression: regex("^chapter [\divxlc]", ...). Would you kindly explain what "[\divxlc]" is searching for? I understand the ^chapter part, but I don't understand the last part.
Thank you in advance.

@juliasilge
Owner

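That's a great question, @waragamwangi! The part in square brackets is a character class: it matches a single character that is either a digit (\d) or one of the letters i, v, x, l, c. The letters are there to identify roman numerals, so the pattern can find "chapter iv" as well as "chapter 4". It doesn't look like it's necessary in this example, but it can be good for other datasets.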
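
To make the character class concrete, here is a minimal sketch in base R. It is an illustration only: the sample lines are made up, and it uses `grepl()` with `perl = TRUE` rather than the `regex()` helper from the original code. The case-insensitive flag is an assumption, added so that headings written in capitals also match.

```r
# Hypothetical sample lines (made up for illustration, not quoted from the novels)
lines <- c(
  "CHAPTER 1",
  "Chapter IV",
  "chapter xii",
  "Chapter One",
  "The chapter ended abruptly.",
  "Chapter closing remarks"
)

# "^chapter " anchors the match to the start of the line.
# "[\\divxlc]" matches ONE character: a digit (\d) or one of i, v, x, l, c.
# In an R string the backslash is doubled, so "\\d" reaches the regex engine as \d.
# perl = TRUE makes \d valid inside the brackets; ignore.case = TRUE is an assumption.
is_heading <- grepl("^chapter [\\divxlc]", lines, ignore.case = TRUE, perl = TRUE)

print(data.frame(line = lines, is_heading = is_heading))
```

Only the first character after "chapter " is checked, so "Chapter One" is rejected, while a line such as "Chapter closing remarks" would still match because "c" is in the class. That is usually acceptable for spotting headings, but worth keeping in mind on other datasets.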

@waragamwangi

Thank you, @juliasilge. It's clear now. I can see the regular expression was able to capture the chapters in Mansfield Park and Emma, which are numbered with roman numerals.
