1010
1111The `modelStudio` package **automates the Explanatory Analysis of Machine Learning predictive models**. Generate advanced interactive and animated model explanations in the form of a **serverless HTML site** with only one line of code. This tool is model agnostic, therefore compatible with most of the black box predictive models and frameworks (e.g.&nbsp;`mlr/mlr3`, `xgboost`, `caret`, `h2o`, `scikit-learn`, `lightGBM`, `keras/tensorflow`).
1212
13- The main `modelStudio()` function computes various (instance and dataset level) model explanations and produces an **interactive,&nbsp;customisable dashboard made with D3.js**. It consists of multiple panels for plots with their short descriptions. Easily&nbsp;**save&nbsp;and&nbsp;share** the dashboard with others. Tools for model exploration unite with tools for EDA (Exploratory Data Analysis) to give a broad overview of the model behavior.
13+ The main `modelStudio()` function computes various (instance and dataset level) model explanations and produces an&nbsp;**interactive,&nbsp;customisable dashboard made with D3.js**. It consists of multiple panels for plots with their short descriptions. Easily&nbsp;**save&nbsp;and&nbsp;share** the dashboard with others. Tools for model exploration unite with tools for EDA (Exploratory Data Analysis) to give a broad overview of the model behavior.
1414
1515<!-- - [explain FIFA19](https://pbiecek.github.io/explainFIFA19/)   --->
1616<!-- - [explain Lung Cancer](https://github.com/hbaniecki/transparent_xai/)   --->
1717&emsp; &emsp; &emsp; &emsp; &emsp; &emsp;
1818[**explain FIFA20**](https://pbiecek.github.io/explainFIFA20/) &emsp;
19- [**R & Python examples**](http://modelstudio.drwhy.ai/articles/vignette_examples.html) &emsp;
19+ [**R & Python examples**](http://modelstudio.drwhy.ai/articles/ms-r-python-examples.html) &emsp;
2020[**More Resources**](http://modelstudio.drwhy.ai/#more-resources) &emsp;
2121[**FAQ & Troubleshooting**](https://github.com/ModelOriented/modelStudio/issues/54)
2222
@@ -41,9 +41,7 @@ library("DALEX")
4141library("modelStudio")
4242
4343# fit a model
44- model <- glm(survived ~ .,
45- data = titanic_imputed,
46- family = "binomial")
44+ model <- glm(survived ~ ., data = titanic_imputed, family = "binomial")
4745
4846# create an explainer for the model
4947explainer <- explain(model,
@@ -59,18 +57,18 @@ modelStudio(explainer)
5957
6058![](man/figures/long.gif)
6159
62- ## R & Python Examples [more](http://modelstudio.drwhy.ai/articles/vignette_examples.html)
60+ ## R & Python Examples [more](http://modelstudio.drwhy.ai/articles/ms-r-python-examples.html)
6361
6462The `modelStudio()` function uses `DALEX` explainers created with `DALEX::explain()` or `DALEXtra::explain_*()`.
6563
6664``` r
67- # update main dependencies
68- install.packages("ingredients")
69- install.packages("iBreakDown")
70-
7165# packages for explainer objects
7266install.packages("DALEX")
7367install.packages("DALEXtra")
68+
69+ # update main dependencies
70+ install.packages("ingredients")
71+ install.packages("iBreakDown")
7472```
7573
7674### mlr [dashboard](https://modeloriented.github.io/modelStudio/mlr.html)
@@ -87,19 +85,16 @@ data <- DALEX::titanic_imputed
8785
8886# split the data
8987index <- sample(1:nrow(data), 0.7 * nrow(data))
90- train <- data[index, ]
91- test <- data[-index, ]
88+ train <- data[index,]
89+ test <- data[-index,]
9290
9391# mlr ClassifTask takes target as factor
9492train$survived <- as.factor(train$survived)
9593
9694# fit a model
97- task <- makeClassifTask(id = "titanic",
98- data = train,
99- target = "survived")
95+ task <- makeClassifTask(id = "titanic", data = train, target = "survived")
10096
101- learner <- makeLearner("classif.ranger",
102- predict.type = "prob")
97+ learner <- makeLearner("classif.ranger", predict.type = "prob")
10398
10499model <- train(learner, task)
105100
@@ -110,7 +105,7 @@ explainer <- explain_mlr(model,
110105 label = "mlr")
111106
112107# pick observations
113- new_observation <- test[1:2, ]
108+ new_observation <- test[1:2,]
114109rownames(new_observation) <- c("id1", "id2")
115110
116111# make a studio for the model
@@ -132,17 +127,18 @@ data <- DALEX::titanic_imputed
132127
133128# split the data
134129index <- sample(1:nrow(data), 0.7 * nrow(data))
135- train <- data[index, ]
136- test <- data[-index, ]
130+ train <- data[index,]
131+ test <- data[-index,]
137132
138133train_matrix <- model.matrix(survived ~ . - 1, train)
139134test_matrix <- model.matrix(survived ~ . - 1, test)
140135
141136# fit a model
142137xgb_matrix <- xgb.DMatrix(train_matrix, label = train$survived)
143- params <- list(eta = 0.01, subsample = 0.6, max_depth = 7, min_child_weight = 3,
144- objective = "binary:logistic", eval_metric = "auc")
145- model <- xgb.train(params, xgb_matrix, nrounds = 1000)
138+
139+ params <- list(max_depth = 7, objective = "binary:logistic", eval_metric = "auc")
140+
141+ model <- xgb.train(params, xgb_matrix, nrounds = 500)
146142
147143# create an explainer for the model
148144explainer <- explain(model ,
@@ -151,7 +147,7 @@ explainer <- explain(model,
151147 label = "xgboost")
152148
153149# pick observations
154- new_observation <- test_matrix[1:2,, drop = FALSE]
150+ new_observation <- test_matrix[1:2, , drop = FALSE]
155151rownames(new_observation) <- c("id1", "id2")
156152
157153# make a studio for the model
@@ -170,6 +166,11 @@ pip3 install dalex --force
170166
171167Use `pickle` Python module and `reticulate` R package to easily make a studio for a model.
172168
169+ ``` {r eval = FALSE}
170+ # package for pickle load
171+ install.packages("reticulate")
172+ ```
173+
173174In this example we will fit a `Pipeline MLPClassifier` model on `titanic` data.
174175
175176First, use `dalex` in Python:
@@ -193,45 +194,47 @@ y = data.survived
193194X_train, X_test, y_train, y_test = train_test_split(X, y)
194195
195196# fit a pipeline model
196- numeric_features = ['age', 'fare', 'sibsp', 'parch']
197- numeric_transformer = Pipeline(
197+ numerical_features = ['age', 'fare', 'sibsp', 'parch']
198+ numerical_transformer = Pipeline(
198199 steps = [
199200 ('imputer', SimpleImputer(strategy = 'median')),
200201 ('scaler', StandardScaler())
201- ]
202+ ]
202203)
203204categorical_features = ['gender', 'class', 'embarked']
204205categorical_transformer = Pipeline(
205206 steps = [
206207 ('imputer', SimpleImputer(strategy = 'constant', fill_value = 'missing')),
207208 ('onehot', OneHotEncoder(handle_unknown = 'ignore'))
208- ]
209+ ]
209210)
210211
211212preprocessor = ColumnTransformer(
212213 transformers = [
213- ('num', numeric_transformer, numeric_features),
214+ ('num', numerical_transformer, numerical_features),
214215 ('cat', categorical_transformer, categorical_features)
215- ]
216+ ]
216217)
217218
219+ classifier = MLPClassifier(hidden_layer_sizes = (150,100,50), max_iter = 500)
220+
218221model = Pipeline(
219222 steps = [
220223 ('preprocessor', preprocessor),
221- ('classifier', MLPClassifier(hidden_layer_sizes = (150, 100, 50), max_iter = 500))
222- ]
224+ ('classifier', classifier)
225+ ]
223226)
224227model.fit(X_train, y_train)
225228
226229# create an explainer for the model
227- explainer = dx.Explainer(model, X_test, y_test, label = 'scikit-learn')
230+ explainer = dx.Explainer(model, data = X_test, y = y_test, label = 'scikit-learn')
228231
229232# ! remove residual_function before dump !
230233explainer.residual_function = None
231234
232235# pack the explainer into a pickle file
233236import pickle
234- pickle_out = open("explainer_scikitlearn.pickle", "wb")
237+ pickle_out = open('explainer_scikitlearn.pickle', 'wb')
235238pickle.dump(explainer, pickle_out)
236239pickle_out.close()
237240```
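
The `residual_function = None` step in the block above is easier to follow with a toy example. The sketch below does not use `dalex`: `toy_explainer` is a hypothetical stand-in built from `types.SimpleNamespace`, and it assumes the attribute holds a function that `pickle` cannot serialize (here a lambda). Whether that is the exact reason in `dalex` is an assumption. It also writes the file inside a `with` block, an alternative to the explicit `open`/`close` pair that closes the file even if the dump fails.

```python
import os
import pickle
import tempfile
from types import SimpleNamespace

# hypothetical stand-in for an explainer object; not the dalex API
toy_explainer = SimpleNamespace(
    label="toy",
    # pickle stores functions by name, so an anonymous lambda cannot be dumped
    residual_function=lambda y, y_hat: y - y_hat,
)


def try_dump(obj):
    """Return True if obj can be pickled, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


dumps_with_function = try_dump(toy_explainer)

# drop the attribute that blocks serialization, mirroring the step above
toy_explainer.residual_function = None
dumps_without_function = try_dump(toy_explainer)

# the with block closes the file even if pickle.dump raises
path = os.path.join(tempfile.mkdtemp(), "explainer_toy.pickle")
with open(path, "wb") as pickle_out:
    pickle.dump(toy_explainer, pickle_out)

# read the file back to confirm the object survived the round trip
with open(path, "rb") as pickle_in:
    restored = pickle.load(pickle_in)

print(dumps_with_function, dumps_without_function, restored.label)
```

In this toy setting the first dump attempt is rejected and the second succeeds, which is the behaviour the comment `# ! remove residual_function before dump !` guards against.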
@@ -241,7 +244,7 @@ Then, use `modelStudio` in R:
241244``` r
242245# load the explainer from the pickle file
243246library(reticulate)
244- explainer <- py_load_object('explainer_scikitlearn.pickle', pickle = "pickle")
247+ explainer <- py_load_object("explainer_scikitlearn.pickle", pickle = "pickle")
245248
246249# make a studio for the model
247250library(modelStudio)
@@ -261,9 +264,9 @@ or with [`r2d3::save_d3_html()`](https://rstudio.github.io/r2d3/articles/publish
261264
262265 - Theoretical introduction to the plots: [Explanatory Model Analysis. Explore, Explain and Examine Predictive Models.](https://pbiecek.github.io/ema)
263266
264- - Vignette: [modelStudio - R & python examples](https://modeloriented.github.io/modelStudio/articles/vignette_examples.html)
267+ - Vignette: [modelStudio - R & Python examples](https://modeloriented.github.io/modelStudio/articles/ms-r-python-examples.html)
265268
266- - Vignette: [modelStudio - perks and features](https://modeloriented.github.io/modelStudio/articles/vignette_modelStudio.html)
269+ - Vignette: [modelStudio - perks and features](https://modeloriented.github.io/modelStudio/articles/ms-perks-features.html)
267270
268271 - Conference poster: [MLinPL2019](misc/MLinPL2019_modelStudio_poster.pdf)
269272