Usecase clustering #90

juliambr · 2017-03-10T12:28:33Z

wrote a usecase for clustering - please review it :)

pandeva · 2017-03-10T13:42:41Z

src/usecase_clustering.Rmd

+An overview over all learners can be found  [here](http://mlr-org.github.io/mlr-tutorial/devel/html/integrated_learners/index.html). You can also call the \texttt{listLearners} command for our specific task.
+
+
+```{r, warning=FALSE, eval = FALSE}


You can leave out warning=FALSE: Travis has all packages installed, so no warning will be produced.

schiffner

Hi, looks very good already.
I just skimmed over it very quickly and commented on some technical stuff.

schiffner · 2017-03-10T16:56:38Z

src/usecase_clustering.Rmd

+set.seed(1234)
+```
+
+This is a use case for clustering with the [%mlr] package. We consider the [agriculture](https://www.rdocumentation.org/packages/cluster/versions/1.10.0/topics/agriculture) dataset that contains observations about $n=12$ countries including  


Please use [agriculture](&cluster::agriculture).
(The build script for the tutorial will expand this to the correct link.)

schiffner · 2017-03-10T16:57:51Z

src/usecase_clustering.Rmd

+
+```{r, fig.width = 5}
+library("cluster")
+data(agriculture)


Please use data(agriculture, package = "cluster")
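
A minimal sketch of that call, assuming the cluster package is installed (it normally ships with R as a recommended package). It loads the dataset without attaching the package:

```r
# Load the dataset straight from the cluster package; library("cluster") is not needed
data(agriculture, package = "cluster")

# Inspect the structure; the diff describes n = 12 countries
str(agriculture)
```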

schiffner · 2017-03-10T16:59:07Z

src/usecase_clustering.Rmd

+
+So let's have a look at the data first.
+
+```{r, fig.width = 5}


Please specify the aspect ratio (fig.asp) instead of the fig.width.
(This works better for the pdf version of the tutorial.)

schiffner · 2017-03-10T16:59:41Z

src/usecase_clustering.Rmd

+
+* define the learning task  ([here](http://mlr-org.github.io/mlr-tutorial/devel/html/task/index.html)),
+* select a learning method ([here](http://mlr-org.github.io/mlr-tutorial/devel/html/learner/index.html)),
+* train the learner with data ([here](http://mlr-org.github.io/mlr-tutorial/devel/html/train/index.html)), 


I think "train the learner" is sufficient.

schiffner · 2017-03-10T17:00:38Z

src/usecase_clustering.Rmd

+We now have to define a clustering task. Notice that a clustering task doesn't have a target variable. 
+
+```{r message = FALSE}
+library(mlr)


You don't need library(mlr), so you can also leave out the message = FALSE option.

schiffner · 2017-03-10T17:08:10Z

src/usecase_clustering.Rmd

+
+Tuning will address the question of choosing the best hyperparameters for our problem.
+
+We first create a search space for the number of clusters $k$, e. g. $k \in \lbrace 2, 3, 4, 5 \rbrace$. Further we define an optimization algorithm and a [resampling strategy](http://mlr-org.github.io/mlr-tutorial/devel/html/resample/index.html).


As above you need to link to resample.md.

schiffner · 2017-03-10T17:08:33Z

src/usecase_clustering.Rmd

+
+We first create a search space for the number of clusters $k$, e. g. $k \in \lbrace 2, 3, 4, 5 \rbrace$. Further we define an optimization algorithm and a [resampling strategy](http://mlr-org.github.io/mlr-tutorial/devel/html/resample/index.html).
+
+Finally, by combining all the previous pieces, we can tune the parameter $k$ by calling \texttt{tuneParams}. We will use discrete_ps with grid search and the silhouette coefficient as optimization criterion:


[&tuneParams]

I would also mention 3-fold cross-validation.
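
For context, a small sketch of what `cv3` stands for, assuming mlr is installed. As far as I know it is the predefined shortcut for a 3-fold cross-validation description, which can also be spelled out explicitly. The name `rdesc` is only for illustration and is not part of the PR.

```r
library(mlr)

# Explicit 3-fold cross-validation description (rdesc is a hypothetical name)
rdesc = makeResampleDesc("CV", iters = 3)
print(rdesc)

# mlr's predefined shortcut, expected to describe the same strategy
print(cv3)
```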

schiffner · 2017-03-10T17:10:22Z

src/usecase_clustering.Rmd

+discrete_ps = makeParamSet(makeDiscreteParam("centers", values = c(2, 3, 4, 5)))
+ctrl = makeTuneControlGrid()
+res = tuneParams(cluster.lrn, agri.task, measures = silhouette, resampling = cv3, 
+                 par.set = discrete_ps, control = ctrl)


Could you please indent code by 2 spaces?

schiffner · 2017-03-10T17:11:08Z

src/usecase_clustering.Rmd

+
+This is our final clustering for our problem.
+
+```{r, fig.width= 5}


Please use fig.asp.

schiffner · 2017-03-10T17:12:36Z

src/usecase_clustering.Rmd

+This is our final clustering for our problem.
+
+```{r, fig.width= 5}
+plot(y ~ x, col = tuned.pred$data$response, data = agriculture)


Could you please use the getter function (I think getPredictionResponse should work here)?
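
A rough sketch of what I have in mind, under some assumptions: the learner is `"cluster.kmeans"` with `centers = 3` (both are my guesses, the diff only shows a `centers` parameter), the packages mlr and clue are installed, and the names `mod` and `pred` are placeholders. `library(mlr)` only appears here so the snippet runs on its own.

```r
library(mlr)

set.seed(1234)
data(agriculture, package = "cluster")

# Clustering task without a target variable
agri.task = makeClusterTask(data = agriculture)

# Assumed learner and number of clusters (not confirmed by the diff)
cluster.lrn = makeLearner("cluster.kmeans", centers = 3)
mod = train(cluster.lrn, agri.task)
pred = predict(mod, task = agri.task)

# Use the getter instead of reaching into pred$data$response
plot(y ~ x, col = getPredictionResponse(pred), data = agriculture)
```

The getter keeps the tutorial code independent of the internal layout of the prediction object.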

SteveBronder · 2017-03-19T20:13:39Z

src/use_case_classification.Rmd

+head(data)
+```
+
+Our aim - as mentioned before - is to predict which kind of people would have survided.


"survided" is a typo; it should read "survived".

SteveBronder · 2017-03-19T20:14:23Z

src/use_case_classification.Rmd

+
+#### Preprocessing
+
+The data set is corrected regarding their data types.


I would call str(data) to show the different types, then mention how and why they need to be corrected.
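
Something along these lines, as a sketch only. The small data frame below is a made-up stand-in for the use case's `data` object, and the column names `Survived`, `Sex` and `Age` are assumptions for illustration:

```r
# Hypothetical stand-in for `data`; the columns are illustrative only
data = data.frame(
  Survived = c(0, 1, 1),
  Sex = c("male", "female", "female"),
  Age = c(22, 38, 26),
  stringsAsFactors = FALSE
)

# Show the types before the correction
str(data)

# Categorical variables stored as numeric or character become factors
data$Survived = as.factor(data$Survived)
data$Sex = as.factor(data$Sex)

# Show the types after the correction
str(data)
```

Printing the structure before and after makes it visible to the reader which columns changed and why.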

pat-s · 2018-06-21T18:17:31Z

@juliambr Do you still have motivation to finish this up here? Would be great! 🎉

juliambr added 7 commits March 9, 2017 11:56

added usecase clustering

ed8384b

worked on usecase clustering, section evaluation still missing

8e4c384

added section performance and tuning

26128fe

fixed some typos, added some links

4481547

removed redundant section on preprocessing

f95dfe2

changed resampling strategy in section tuning

6a7b525

added some links

94fcbe6

juliambr requested review from gcskoenig, juliabrosig, engelhardtk and pandeva March 10, 2017 12:28

pandeva reviewed Mar 10, 2017

schiffner suggested changes Mar 10, 2017

schiffner mentioned this pull request Mar 10, 2017

the the quickstart page really sucks. should we change it? #56

Closed

modificated links and made suggested changes

8c24c81

SteveBronder reviewed Mar 19, 2017
