Predict the magnitude of #TidyTuesday tornadoes with effect encoding and xgboost | Julia Silge #92
Comments
Hello Julia, thanks for another informative post. Your method of handling high-cardinality categorical variables through likelihood encoding was interesting. I noticed that the 'st' variable is a top contributor to the model. However, the encoding adds a degree of abstraction. I am trying to interpret the effects of specific states on tornado magnitude. Can we somehow map these encoded 'st' values back to the original states for more intuitive interpretation? Would the encoded 'st' values themselves be a straightforward way to understand their effects? I am also wondering whether PDP could be used to further explore the effect of each state. Thanks again for your insightful post. Looking forward to more.
@msahil515 Yes, you can get out the values associated with each value for You could also use a partial dependence profile to examine the results more. I like using
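As a rough illustration of mapping encoded values back to states, here is a minimal sketch in plain Python. It is not the R code from the post, and it is not how the encoding step in the post estimates its values. It assumes a simplified effect encoding in which each state is replaced by the mean magnitude observed for that state. The function `fit_effect_encoding` and the toy data are hypothetical. The point is only that the fitted encoding is a lookup table from category to number, so it can be read directly as "state, encoded value" pairs.

```python
from collections import defaultdict


def fit_effect_encoding(categories, outcomes):
    """Return {category: mean outcome}, a simplified stand-in for an effect encoding."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for cat, y in zip(categories, outcomes):
        sums[cat] += y
        counts[cat] += 1
    return {cat: sums[cat] / counts[cat] for cat in sums}


# Hypothetical toy data: state abbreviations and tornado magnitudes.
states = ["TX", "TX", "OK", "OK", "FL", "FL"]
magnitudes = [1, 3, 2, 4, 0, 1]

# The fitted encoding is itself the mapping between states and encoded values.
encoding = fit_effect_encoding(states, magnitudes)

# Sorting the lookup table shows which states carry the highest encoded values.
ranked = sorted(encoding.items(), key=lambda kv: kv[1], reverse=True)
for state, value in ranked:
    print(f"{state}: {value:.2f}")
```

Reading the table this way shows how each state was encoded, but not how the downstream model uses that number, which is where a partial dependence profile can add information.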
Hello Julia, I would like to use a different encoding method for categorical variables, similar to the internal PCA ordering method used by ranger (adapted from Coppersmith). It is target-based, so it needs to be done on each fold rather than prior to splitting the data. How could I incorporate this into a recipe step, please? Many thanks!
@smithhelen Take a look at this article on how to create your own recipe step.
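To make the "done on each fold" requirement concrete, here is a minimal sketch in plain Python. It is not the recipes API and not the PCA ordering method mentioned above. It assumes a simple per-category mean as the target-based encoding, and all function names are hypothetical. The encoding is estimated only from the training rows of each fold and then applied to the held-out rows, which is the same estimate-then-apply split that a custom recipe step expresses through its `prep()` and `bake()` methods.

```python
def fit_target_encoding(categories, outcomes):
    """Estimate per-category means from training rows, plus an overall fallback mean."""
    totals = {}
    for cat, y in zip(categories, outcomes):
        s, n = totals.get(cat, (0.0, 0))
        totals[cat] = (s + y, n + 1)
    overall = sum(outcomes) / len(outcomes)
    means = {cat: s / n for cat, (s, n) in totals.items()}
    return means, overall


def apply_target_encoding(categories, means, fallback):
    """Encode rows with already-fitted means; unseen categories get the fallback."""
    return [means.get(cat, fallback) for cat in categories]


def encode_by_fold(categories, outcomes, n_folds):
    """Encode each row using statistics learned only from the other folds."""
    encoded = [None] * len(categories)
    for fold in range(n_folds):
        held_out = [i for i in range(len(categories)) if i % n_folds == fold]
        training = [i for i in range(len(categories)) if i % n_folds != fold]
        means, overall = fit_target_encoding(
            [categories[i] for i in training],
            [outcomes[i] for i in training],
        )
        values = apply_target_encoding(
            [categories[i] for i in held_out], means, overall
        )
        for i, v in zip(held_out, values):
            encoded[i] = v
    return encoded


# Hypothetical toy data: category "c" appears once, so it is unseen when held out.
cats = ["a", "a", "b", "b", "a", "c"]
ys = [1.0, 3.0, 10.0, 20.0, 5.0, 7.0]
fold_encoded = encode_by_fold(cats, ys, n_folds=2)
print(fold_encoded)
```

Because no row's own outcome contributes to its encoded value, the held-out performance estimate is not inflated by leakage from the target.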
Hello Julia, congrats on your impressive work. I have a question about the Thank you.
@robsonpro Ah no, if you set
You can provide your own grid in that argument, using any of the kinds of grid specifications outlined in that chapter. If you use the default or do something like
Thank you so much for your attention and explanation, @juliasilge. I understand it now.
A data science blog
https://juliasilge.com/blog/tornadoes/