R script to calculate entropy and information gain
Uses the standard Shannon entropy formula.
Lets you specify which variable in a dataset is the parent (root) node, then calculates the entropy and information gain of every other variable with respect to that node.
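For reference, the quantities involved can be written out as follows. This is the textbook form, assuming base-2 logarithms (bits); the package may use a different log base or a different conditioning direction, so treat the symbols as illustrative. Here $Y$ stands for the parent (root) variable and $X$ for one of the other variables.

$$
H(Y) = -\sum_{y} p(y)\,\log_2 p(y)
$$

$$
H(Y \mid X) = \sum_{x} p(x)\, H(Y \mid X = x)
$$

$$
IG(Y, X) = H(Y) - H(Y \mid X)
$$

Information gain defined this way is symmetric in $X$ and $Y$, so the gain column does not depend on which of the two variables is treated as the conditioning one.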
- library(devtools)
- devtools::install_github('scottroot/shannon-entropy')
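- library(shannon-entropy)  # fails in R as written, because package names cannot contain hyphens; use the Package name from the repository's DESCRIPTION file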
- ig(root, df, ignore)
- root: The parent (root) variable.
- df: Your dataset or data frame.
- ignore: The variables or columns in your dataset that you don't want to include.
Returns a data frame with the entropy and information gain of each variable.
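To make the interface concrete, here is a minimal base-R sketch of what a function with this signature might do. It is not the package's code: `ig_sketch`, `entropy_bits`, the output column names, and the `weather` data are all hypothetical. It also assumes that `root` and `ignore` are given as column names and that entropy is measured in bits.

```r
# Shannon entropy (in bits) of a discrete vector.
entropy_bits <- function(x) {
  p <- table(x) / length(x)  # empirical probabilities of each level
  p <- p[p > 0]              # drop empty levels to avoid 0 * log2(0)
  -sum(p * log2(p))
}

# Hypothetical stand-in for the interface described above.
# root:   name of the parent variable (assumed to be a column name)
# df:     data frame of discrete variables
# ignore: names of columns to leave out
ig_sketch <- function(root, df, ignore = character(0)) {
  vars <- setdiff(names(df), c(root, ignore))
  h_root <- entropy_bits(df[[root]])
  rows <- lapply(vars, function(v) {
    # Entropy of the root within each level of v, weighted by level frequency.
    groups <- split(df[[root]], df[[v]], drop = TRUE)
    cond <- sum(lengths(groups) / nrow(df) *
                  vapply(groups, entropy_bits, numeric(1)))
    data.frame(variable = v, entropy = cond, info_gain = h_root - cond)
  })
  do.call(rbind, rows)
}

# Made-up example data.
weather <- data.frame(
  id      = 1:4,
  outlook = c("sunny", "sunny", "rain", "rain"),
  windy   = c("yes", "no", "yes", "no"),
  play    = c("yes", "yes", "no", "no")
)

result <- ig_sketch(root = "play", df = weather, ignore = "id")
print(result)
```

In this toy data `outlook` fully determines `play`, so its conditional entropy is 0 and its information gain is 1 bit, while `windy` tells you nothing about `play` and gets a gain of 0.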