Find high FREX and high lift words for #TidyTuesday Stranger Things dialogue | Julia Silge #77

utterances-bot · 2022-10-22T13:55:17Z

Find high FREX and high lift words for #TidyTuesday Stranger Things dialogue | Julia Silge

A data science blog

https://juliasilge.com/blog/stranger-things/

martinsykora · 2022-10-22T13:55:18Z

As a fan of Stranger Things, I really enjoyed this. Very nice blog post, and it's really handy to now be able to tidy up the FREX and lift measures with tidytext. Great work!

m-olaide · 2022-10-22T16:49:01Z

This is another amazing presentation as usual. Thanks for your efforts. I have a couple of questions:

As shown, FREX and lift return different words for each topic. Which of them would you recommend for practical applications?
You mentioned that it's not advisable to "remove stop words before building topic models". However, in the post you linked for stm::estimateEffect(), you removed stop words before building the topic models for that case study. Please advise on the best approach: should stop words be removed before building topic models or not?

Thanks

juliasilge · 2022-10-23T21:24:23Z

@m-olaide Thanks for the great questions!

I have found both FREX and lift words to help people understand what a topic is about; I often would report both. If you want to see which would be more useful in your specific situation, I recommend reading the stm vignette and especially the references in there for how FREX and lift are designed and used.
For the best quality topics, you typically don't want to remove stop words, as explained in the Schofield & Mimno paper I linked in this post. Sometimes I will still remove them to make a quick-and-dirty topic model that doesn't include those super common words that are used in many or all topics.
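To make the difference between the two measures concrete, here is a minimal base R sketch on made-up numbers. It is a simplified illustration, not the stm implementation: the `beta` matrix, the `word_counts` vector, and the helper names are all hypothetical, the FREX weight `w` is assumed to be 0.5, and any shrinkage stm may apply is left out. Lift divides each topic-word probability by the word's overall probability in the corpus, while FREX takes a weighted harmonic mean of a word's within-topic rank on frequency and on exclusivity.

```r
# Toy topic-word probabilities (rows = topics, columns = words); values are made up.
beta <- matrix(
  c(0.50, 0.30, 0.15, 0.05,
    0.50, 0.05, 0.15, 0.30),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("topic1", "topic2"),
                  c("the", "demogorgon", "school", "lab"))
)

# Hypothetical corpus-wide word counts, used only for the empirical word probability.
word_counts <- c(the = 1000, demogorgon = 35, school = 30, lab = 35)
emp_prob <- word_counts / sum(word_counts)

# Lift: topic-word probability relative to the word's overall probability.
lift <- sweep(beta, 2, emp_prob, "/")

# Exclusivity: share of a word's total probability mass that belongs to each topic.
exclusivity <- sweep(beta, 2, colSums(beta), "/")

# Rank words within each topic and scale the ranks to (0, 1].
rank_within_topic <- function(m) t(apply(m, 1, rank)) / ncol(m)

# FREX: weighted harmonic mean of the exclusivity rank and the frequency rank.
w <- 0.5  # assumed weight on exclusivity
frex <- 1 / (w / rank_within_topic(exclusivity) +
             (1 - w) / rank_within_topic(beta))

# Highest-scoring word per topic under each measure.
top_word <- function(m) colnames(m)[apply(m, 1, which.max)]
top_word(beta)  # "the" "the"
top_word(lift)  # "demogorgon" "lab"
top_word(frex)  # "demogorgon" "lab"
```

In this toy example the raw probabilities put the same very common word at the top of both topics, while lift and FREX both surface the word that distinguishes each topic. On real data the two measures often disagree with each other, which is one reason to look at both.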

Kenjd · 2022-12-16T21:40:50Z

Very thankful for all you share, Julia.
Would you have an idea why this error occurs when trying to run the topic_model for "frex"?
I know it worked originally in your video, but now I get this error when I run the code, and I can't track it down.
Any thoughts are appreciated.
Thanks so much.

Error in match.arg(matrix) :
'arg' should be one of “beta”, “gamma”, “theta”

Kenjd · 2022-12-16T21:42:20Z

Sorry, it's the "Stranger Things" Tidy Tuesday entry.

juliasilge · 2022-12-16T23:22:37Z

@Kenjd Hmmmm, it's hard to say here because there aren't a lot of details about where you are getting that error.
Can you create a reprex (a minimal reproducible example) for this? The goal of a reprex is to make it easier for us to recreate your problem so that we can understand it and/or fix it. If you've never heard of a reprex before, you may want to start with the tidyverse.org help page.
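One detail that is often useful to include in a reprex is which package versions are installed. The error above lists only "beta", "gamma", and "theta" as accepted values, which is what you would expect to see if the `tidy()` method being called does not recognize "frex"; an installed version that differs from the one used in the post is one possible explanation, though that is an assumption and not a diagnosis. A small base R sketch for collecting the versions (the `pkgs` vector is just an illustrative choice):

```r
# Packages whose versions are worth reporting alongside the error.
pkgs <- c("tidytext", "stm")

# Names of all packages installed in the current library paths.
installed <- rownames(installed.packages())

# Version string for each package, or NA if it is not installed.
versions <- sapply(pkgs, function(p) {
  if (p %in% installed) as.character(packageVersion(p)) else NA_character_
})
versions
```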

Once you have a reprex, I recommend posting on RStudio Community, which is a great forum for getting help with these kinds of analysis questions. Thanks! 🙌
