x2y

The goal of {x2y} is to compute column-to-column mutual information gain for large data frames, so that you can remove noisy or redundant columns, based on their information gain, before machine learning modeling.

This is a generalization of the fantastic RViews blog post by Rama Ramakrishnan.
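For intuition, the referenced post scores a pair (x, y) by how much a simple decision tree on x reduces the prediction error for y, relative to a baseline that ignores x. The sketch below illustrates that idea for a numeric target only. It assumes the rpart package is available, and the helper name x2y_numeric_sketch is hypothetical; it is an illustration, not necessarily how {x2y} computes its scores.

```r
library(rpart)

# Hypothetical helper, not part of {x2y}: percentage reduction in mean
# absolute error when a regression tree on x predicts a numeric y,
# compared with always predicting mean(y).
x2y_numeric_sketch <- function(x, y) {
  d <- data.frame(x = x, y = y)
  fit <- rpart(y ~ x, data = d, method = "anova")
  baseline_mae <- mean(abs(d$y - mean(d$y)))
  model_mae <- mean(abs(d$y - predict(fit, d)))
  # Clamped at zero in this sketch so the score stays in [0, 100]
  100 * max(0, 1 - model_mae / baseline_mae)
}

# Noisy half circle: x says a lot about y, but y says little about x
set.seed(42)
x <- seq(-1, 1, 0.01)
y <- sqrt(1 - x^2) + rnorm(length(x), mean = 0, sd = 0.05)

score_xy <- x2y_numeric_sketch(x, y)
score_yx <- x2y_numeric_sketch(y, x)
```

Because the score is built from a predictive model, it is not symmetric: score_xy and score_yx generally differ.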

Installation

You can install the development version of x2y from GitHub with:

# install.packages("pak")
pak::pak("cregouby/x2y")

Example

This basic example builds a data frame in which y follows a noisy half circle in x, and z is unrelated Gaussian noise:

library(x2y)
library(ggplot2)
## Build a noisy half circle: y depends on x, z is independent noise
set.seed(42)
x <- seq(-1, 1, 0.01)

circular_df <- data.frame(
  x = x,
  y = sqrt(1 - x^2) + rnorm(length(x), mean = 0, sd = 0.05),
  z = rnorm(length(x))
)

ggplot(circular_df, aes(x = x, y = y)) +
  geom_point()

Here is the mutual information gain computed by {x2y} for every ordered pair of columns in this example:

dx2y(circular_df)
#>   x y perc_of_obs   x2y
#> 1 x y         100 68.88
#> 2 z y         100 14.30
#> 3 y x         100 10.20
#> 4 y z         100  9.86
#> 5 z x         100  9.76
#> 6 x z         100  1.33
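To act on these scores, one option is to keep only the columns whose score toward the modeling target exceeds a cut-off. The sketch below uses base R only and re-enters the scores printed above by hand so it runs without the package; the target name and the threshold value of 20 are illustrative assumptions, not recommendations from {x2y}.

```r
# Scores copied from the dx2y(circular_df) output above
scores <- data.frame(
  x = c("x", "z", "y", "y", "z", "x"),
  y = c("y", "y", "x", "z", "x", "z"),
  perc_of_obs = 100,
  x2y = c(68.88, 14.30, 10.20, 9.86, 9.76, 1.33),
  stringsAsFactors = FALSE
)

target <- "y"     # column we want to model
threshold <- 20   # hypothetical cut-off; tune it for your own data

# Keep rows that score a candidate predictor against the target
to_target <- scores[scores$y == target, ]

keep <- to_target$x[to_target$x2y >= threshold]
drop <- setdiff(to_target$x, keep)

keep  # "x"
drop  # "z"
```

With this cut-off, x is retained as a predictor of y and the noise column z is dropped.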

Related work

The {lares} package provides an x2y() function that draws on the same inspiration.
