A generalisation of the RViews article by Rama Ramakrishnan

License

MIT (LICENSE.md). A second file, LICENSE, has an unrecognized license type.

x2y

The goal of {x2y} is to provide column-to-column mutual information gain for large data frames, so that you can remove noisy or redundant columns, based on their information gain, before machine learning modeling.

This is a generalization of the fantastic RViews blog post by Rama Ramakrishnan.
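To make the idea concrete, here is a minimal sketch of a pairwise score for two numeric columns, assuming the approach described in the RViews post: fit a regression tree that predicts `y` from `x`, then report the percent reduction in mean absolute error compared with always predicting the mean of `y`. The helper `x2y_sketch()` is hypothetical and is not part of {x2y}; the package's own implementation may differ.

```r
library(rpart)

# Hypothetical helper, NOT part of {x2y}: percent reduction in mean absolute
# error when predicting numeric y from x with a regression tree.
x2y_sketch <- function(x, y) {
  fit <- rpart(y ~ x, data = data.frame(x = x, y = y), method = "anova")
  y_hat <- predict(fit)                      # fitted values on the training data
  model_error <- mean(abs(y_hat - y))        # error of the tree
  baseline_error <- mean(abs(mean(y) - y))   # error of predicting the mean
  # Clamp to [0, 100] so a tree that does worse than the baseline scores 0
  100 * max(0, min(1 - model_error / baseline_error, 1))
}

set.seed(42)
x <- seq(-1, 1, 0.01)
y <- sqrt(1 - x^2) + rnorm(length(x), sd = 0.05)

x2y_sketch(x, y)  # x is informative about y
x2y_sketch(y, x)  # y says little about the sign of x
```

The score is asymmetric by design: knowing `x` pins down `y` on the semicircle, while knowing `y` leaves two candidate values of `x`.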

Installation

You can install the development version of x2y from GitHub with:

# install.packages("pak")
pak::pak("cregouby/x2y")

Example

This basic example builds a noisy upper semicircle, where y depends on x non-linearly, and adds an independent noise column z:

library(x2y)
library(ggplot2)

# Noisy upper semicircle: y depends on x non-linearly, z is pure noise
set.seed(42)
x <- seq(-1, 1, 0.01)

circular_df <- data.frame(
  x = x,
  y = sqrt(1 - x^2) + rnorm(length(x), mean = 0, sd = 0.05),
  z = rnorm(length(x))
)

# Scatter plot of y against x
ggplot(circular_df, aes(x = x, y = y)) +
  geom_point()

Here is the mutual information gain computed by {x2y} for each ordered pair of columns in this example:

dx2y(circular_df)
#>   x y perc_of_obs   x2y
#> 1 x y         100 68.88
#> 2 z y         100 14.30
#> 3 y x         100 10.20
#> 4 y z         100  9.86
#> 5 z x         100  9.76
#> 6 x z         100  1.33
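
One possible way to act on these scores is to keep only the predictors of a chosen target whose score clears a cut-off. The sketch below rebuilds the table printed above by hand so that it runs without {x2y}; the `target` and `threshold` names and the cut-off of 20 are illustrative assumptions, not package defaults.

```r
# Scores copied from the printed dx2y(circular_df) output above
scores <- data.frame(
  x = c("x", "z", "y", "y", "z", "x"),
  y = c("y", "y", "x", "z", "x", "z"),
  perc_of_obs = 100,
  x2y = c(68.88, 14.30, 10.20, 9.86, 9.76, 1.33),
  stringsAsFactors = FALSE
)

target <- "y"     # column we want to model (assumption for this sketch)
threshold <- 20   # arbitrary cut-off, to be tuned for your data

# Predictors of the target whose score is at least the cut-off
keep <- scores$x[scores$y == target & scores$x2y >= threshold]
keep
#> [1] "x"
```

With this cut-off, the noise column z is dropped as a predictor of y and only x is kept.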

Related work
