vignettes/dials.Rmd
+36 -17
@@ -8,7 +8,9 @@ output:
 toc: yes
 ---

-```{r setup, include = FALSE}
+```{r}
+#| label: setup
+#| include: false
 knitr::opts_chunk$set(
   message = FALSE,
   digits = 3,
@@ -45,14 +47,16 @@ Otherwise, the information contained in parameter objects are different for diff

 An example of a numeric tuning parameter is the cost-complexity parameter of CART trees, otherwise known as $C_p$. A parameter object for $C_p$ can be created in `dials` using:

-```{r cp}
+```{r}
+#| label: cp
 library(dials)
 cost_complexity()
 ```

 Note that this parameter is handled in log-10 units and the default range of values is between `10^-10` and `0.1`. The range of possible values can be retrieved and changed with utility functions. We'll use the pipe operator here:
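A minimal sketch of those utilities, assuming the `range_get()` and `range_set()` helpers from `dials` and the `%>%` pipe from `magrittr`. The new range `c(-5, 1)` is an arbitrary illustration and is given in log-10 units:

```r
library(dials)
library(magrittr)  # provides %>%; assumed here, any package exporting the pipe works

# Current range, reported in the natural units by default
cp_range <- cost_complexity() %>% range_get()
cp_range

# Replace the range; the new limits are on the transformed (log-10) scale
cp_wide <- cost_complexity() %>% range_set(c(-5, 1))
cp_wide

# The same range can be supplied when the object is created
cost_complexity(range = c(-5, 1))
```

Setting the range at creation time and setting it afterwards with `range_set()` should describe the same parameter; which one to use is a matter of convenience.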
 Random values can be sampled too. A uniform distribution over the range is used. Since this parameter has a transformation associated with it, the values are simulated on the transformed scale and then returned in the natural units (although the `original` argument can be used to change this):

-```{r cp-sim}
+```{r}
+#| label: cp-sim
 set.seed(5473)
 cost_complexity() %>% value_sample(n = 4)
 ```
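
To see the effect of the `original` argument, the sketch below draws the same kind of sample but keeps it on the transformed (log-10) scale. The seed and `n` are arbitrary choices for illustration:

```r
library(dials)

set.seed(5473)  # arbitrary seed
# original = FALSE returns the values on the transformed (log-10) scale
cp_log <- value_sample(cost_complexity(), n = 4, original = FALSE)
cp_log

# Every draw lies inside the default transformed range of [-10, -1]
all(cp_log >= -10 & cp_log <= -1)
```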

 For CART trees, there is a discrete set of values that exists for a given data set. It may be a good idea to assign these possible values to the object. We can get them by fitting an initial `rpart` model and then adding the values to the object. For `mtcars`, there are only three values:

-```{r rpart, error=TRUE}
+```{r}
+#| label: rpart
+#| error: true
 library(rpart)
 cart_mod <- rpart(mpg ~ ., data = mtcars, control = rpart.control(cp = 0.000001))
 Now, if a sequence or random sample is requested, it uses the set values:

-```{r rpart-cp-vals}
+```{r}
+#| label: rpart-cp-vals
 mtcars_cp %>% value_seq(2)
 # Sampling specific values is done with replacement
 mtcars_cp %>%
@@ -113,7 +123,8 @@ mtcars_cp %>%

 Any transformation from the `scales` package can be used with the numeric parameters, as can a custom transformation generated with `scales::trans_new()`.

-```{r custom-transform}
+```{r}
+#| label: custom-transform
 trans_raise <- scales::trans_new(
   "raise",
   transform = function(x) 2^x,
@@ -126,7 +137,8 @@ custom_cost
 Note that if a transformation is used, the `range` argument specifies the parameter range _on the transformed scale_.
 For this version of `cost()`, parameter values are sampled between 1 and 10 and then transformed back to the original scale by the inverse `-log2()`. So on the original scale, the sampled values are between `-log2(10)` (about -3.32) and `-log2(1)` (which is 0).
 In the discrete case there is no notion of a range. The parameter objects are defined by their discrete values. For example, consider a parameter for the types of kernel functions that are used with distance functions:

-```{r wts}
+```{r}
+#| label: wts
 weight_func()
 ```

 The helper functions are analogous to those for the quantitative parameters:
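
As a hedged illustration of those helpers on a qualitative parameter, the sketch below restricts `weight_func()` to two of its values and then samples and sequences from them. The chosen values, the seed, and the object names are arbitrary:

```r
library(dials)

# Restrict the parameter to a subset of its possible values
wts <- weight_func(values = c("rectangular", "triangular"))
wts

set.seed(1)  # arbitrary seed
# Random draws are taken from the discrete values
wts_sample <- value_sample(wts, 3)
wts_sample

# A sequence returns the values in their stored order
wts_seq <- value_seq(wts, 2)
wts_seq
```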
@@ -159,7 +173,8 @@ The package contains two constructors that can be used to create new quantitativ

 There are some cases where the range of parameter values is data dependent. For example, the upper bound on the number of neighbors cannot be known if the number of data points in the training set is not known. For that reason, some parameters have an _unknown_ placeholder:

-```{r unk}
+```{r}
+#| label: unk
 mtry()
 sample_size()
 num_terms()
@@ -169,15 +184,17 @@ num_comp()

 These values must be initialized prior to generating parameter values. The `finalize()` methods can be used to help remove the unknowns:

-```{r finalize-mtry}
+```{r}
+#| label: finalize-mtry
 finalize(mtry(), x = mtcars[, -1])
 ```
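
To make the effect of finalization visible, the following sketch checks for unknowns before and after, assuming the `has_unknowns()` helper exported by `dials`. The name `mtry_final` is introduced only for this illustration:

```r
library(dials)

# The upper bound of mtry() starts out as an unknown placeholder
has_unknowns(mtry())

# Finalizing with the 10 predictor columns of mtcars fills it in
mtry_final <- finalize(mtry(), x = mtcars[, -1])
has_unknowns(mtry_final)

# The finalized range can now be queried like any other
range_get(mtry_final)
```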

 ## Parameter Sets

 These are collections of parameters used in a model, recipe, or other object. They can also be created manually and can have alternate identification fields:
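
One way to build such a set by hand is sketched below, assuming the list method of `dials::parameters()`. The ids `cp` and `wts` are hypothetical names chosen for this example, not names used by the package:

```r
library(dials)

# A parameter set built manually from a named list of parameter objects;
# the list names are used as the identification field (the id column)
pset <- parameters(list(cp = cost_complexity(), wts = weight_func()))
pset

# The ids differ from the default parameter names
pset$id
```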