Commit fbae5cf

Supersede case_match() (#7735)
* Supersede `case_match()`
* Use alphabetical order for `Superseded` section
1 parent 1837d55 commit fbae5cf

7 files changed: +174 -120 lines changed

DESCRIPTION

Lines changed: 2 additions & 1 deletion
@@ -25,7 +25,7 @@ Imports:
     cli (>= 3.6.2),
     generics,
     glue (>= 1.3.2),
-    lifecycle (>= 1.0.3),
+    lifecycle (>= 1.0.4.9000),
     magrittr (>= 1.5),
     methods,
     pillar (>= 1.9.0),
@@ -67,4 +67,5 @@ LazyData: true
 Roxygen: list(markdown = TRUE)
 RoxygenNote: 7.3.3
 Remotes:
+    r-lib/lifecycle,
     r-lib/vctrs
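
The new `lifecycle (>= 1.0.4.9000)` floor is a development version number, which is presumably why this hunk also adds `r-lib/lifecycle` to `Remotes`. A minimal sketch of how R orders such version strings, using base R only; the release number `1.0.4` below is an illustrative value chosen for the comparison, not something stated in this commit:

```r
# Version floor taken from the DESCRIPTION hunk above.
required <- package_version("1.0.4.9000")

# Illustrative value only: a hypothetical installed release.
installed <- package_version("1.0.4")

# A four-component development version sorts after the three-component
# release, so the hypothetical release does not satisfy the new floor.
satisfied <- installed >= required
print(satisfied)

# Against a real library, the same comparison would use the installed version.
if (requireNamespace("lifecycle", quietly = TRUE)) {
  print(utils::packageVersion("lifecycle") >= required)
}
```

If the second check prints `FALSE`, the installed lifecycle is older than the floor this commit asks for.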

NEWS.md

Lines changed: 2 additions & 0 deletions
@@ -1,5 +1,7 @@
 # dplyr (development version)
 
+* `case_match()` is now superseded by `recode_values()` and `replace_values()`.
+
 * The superseded `recode()` now has updated documentation showing how to migrate to `recode_values()` and `replace_values()`.
 
 * `case_when()` is now part of a family of 4 related functions, 3 of which are new:

R/case-match.R

Lines changed: 84 additions & 45 deletions
@@ -1,34 +1,19 @@
 #' A general vectorised `switch()`
 #'
 #' @description
+#' `r lifecycle::badge("superseded")`
+#'
+#' `case_match()` is superseded by [recode_values()] and [replace_values()],
+#' which are more powerful, have more intuitive names, and have better safety.
+#' In addition to the familiar two-sided formula interface, these functions also
+#' have `from` and `to` arguments which allow you to incorporate a lookup table
+#' into the recoding process.
+#'
 #' This function allows you to vectorise multiple [switch()] statements. Each
 #' case is evaluated sequentially and the first match for each element
 #' determines the corresponding value in the output vector. If no cases match,
 #' the `.default` is used.
 #'
-#' `case_match()` is an R equivalent of the SQL "simple" `CASE WHEN` statement.
-#'
-#' ## Connection to `case_when()`
-#'
-#' While [case_when()] uses logical expressions on the left-hand side of the
-#' formula, `case_match()` uses values to match against `.x` with. The following
-#' two statements are roughly equivalent:
-#'
-#' ```
-#' case_when(
-#'   x %in% c("a", "b") ~ 1,
-#'   x %in% "c" ~ 2,
-#'   x %in% c("d", "e") ~ 3
-#' )
-#'
-#' case_match(
-#'   x,
-#'   c("a", "b") ~ 1,
-#'   "c" ~ 2,
-#'   c("d", "e") ~ 3
-#' )
-#' ```
-#'
 #' @param .x A vector to match against.
 #'
 #' @param ... <[`dynamic-dots`][rlang::dyn-dots]> A sequence of two-sided
@@ -58,61 +43,98 @@
 #' A vector with the same size as `.x` and the same type as the common type of
 #' the RHS inputs and `.default` (if not overridden by `.ptype`).
 #'
-#' @seealso [case_when()]
-#'
 #' @export
 #' @examples
+#' # `case_match()` has been superseded by `recode_values()` and
+#' # `replace_values()`
+#'
 #' x <- c("a", "b", "a", "d", "b", NA, "c", "e")
 #'
-#' # `case_match()` acts like a vectorized `switch()`.
-#' # Unmatched values "fall through" as a missing value.
+#' # `recode_values()` is a 1:1 replacement for `case_match()`
 #' case_match(
 #'   x,
 #'   "a" ~ 1,
 #'   "b" ~ 2,
 #'   "c" ~ 3,
 #'   "d" ~ 4
 #' )
-#'
-#' # Missing values can be matched exactly, and `.default` can be used to
-#' # control the value used for unmatched values of `.x`
-#' case_match(
+#' recode_values(
 #'   x,
 #'   "a" ~ 1,
 #'   "b" ~ 2,
 #'   "c" ~ 3,
-#'   "d" ~ 4,
-#'   NA ~ 0,
-#'   .default = 100
+#'   "d" ~ 4
 #' )
 #'
-#' # Input values can be grouped into the same expression to map them to the
-#' # same output value
-#' case_match(
+#' # `recode_values()` has an additional `unmatched` argument to help you catch
+#' # missed mappings
+#' try(recode_values(
 #'   x,
-#'   c("a", "b") ~ "low",
-#'   c("c", "d", "e") ~ "high"
+#'   "a" ~ 1,
+#'   "b" ~ 2,
+#'   "c" ~ 3,
+#'   "d" ~ 4,
+#'   unmatched = "error"
+#' ))
+#'
+#' # `recode_values()` also has additional `from` and `to` arguments, which are
+#' # useful when your lookup table is defined elsewhere (for example, it could
+#' # be read in from a CSV file). This is very difficult to do with
+#' # `case_match()`!
+#' lookup <- tribble(
+#'   ~from, ~to,
+#'   "a", 1,
+#'   "b", 2,
+#'   "c", 3,
+#'   "d", 4
 #' )
 #'
-#' # `case_match()` isn't limited to character input:
-#' y <- c(1, 2, 1, 3, 1, NA, 2, 4)
+#' recode_values(x, from = lookup$from, to = lookup$to)
+#'
+#' # Both `case_match()` and `recode_values()` work with more than just
+#' # character inputs:
+#' y <- as.integer(c(1, 2, 1, 3, 1, NA, 2, 4))
 #'
 #' case_match(
 #'   y,
 #'   c(1, 3) ~ "odd",
 #'   c(2, 4) ~ "even",
 #'   .default = "missing"
 #' )
+#' recode_values(
+#'   y,
+#'   c(1, 3) ~ "odd",
+#'   c(2, 4) ~ "even",
+#'   default = "missing"
+#' )
+#'
+#' # Or with a lookup table
+#' lookup <- tribble(
+#'   ~from, ~to,
+#'   c(1, 3), "odd",
+#'   c(2, 4), "even"
+#' )
+#' recode_values(y, from = lookup$from, to = lookup$to, default = "missing")
 #'
-#' # Setting `.default` to the original vector is a useful way to replace
-#' # selected values, leaving everything else as is
+#' # `replace_values()` is a convenient way to replace selected values, leaving
+#' # everything else as is. It's similar to `case_match(y, .default = y)`.
+#' replace_values(y, NA ~ 0)
 #' case_match(y, NA ~ 0, .default = y)
 #'
+#' # Notably, `replace_values()` is type stable, which means that `y` can't
+#' # change types out from under you, unlike with `case_match()`!
+#' typeof(y)
+#' typeof(replace_values(y, NA ~ 0))
+#' typeof(case_match(y, NA ~ 0, .default = y))
+#'
+#' # We believe that `replace_values()` better expresses intent when doing a
+#' # partial replacement. Compare these two `mutate()` calls, each with the
+#' # goals of:
+#' # - Replace missings in `hair_color`
+#' # - Replace some of the `species`
 #' starwars |>
 #'   mutate(
-#'     # Replace missings, but leave everything else alone
 #'     hair_color = case_match(hair_color, NA ~ "unknown", .default = hair_color),
-#'     # Replace some, but not all, of the species
 #'     species = case_match(
 #'       species,
 #'       "Human" ~ "Humanoid",
@@ -122,7 +144,24 @@
 #'     ),
 #'     .keep = "used"
 #'   )
+#'
+#' updates <- tribble(
+#'   ~from, ~to,
+#'   "Human", "Humanoid",
+#'   "Droid", "Robot",
+#'   c("Wookiee", "Ewok"), "Hairy"
+#' )
+#'
+#' starwars |>
+#'   mutate(
+#'     hair_color = replace_values(hair_color, NA ~ "unknown"),
+#'     species = replace_values(species, from = updates$from, to = updates$to),
+#'     .keep = "used"
+#'   )
 case_match <- function(.x, ..., .default = NULL, .ptype = NULL) {
+  # Superseded in dplyr 1.2.0
+  lifecycle::signal_stage("superseded", "case_match()", "recode_values()")
+
   # Matching historical behavior of `case_match()`, which was to work like
   # `case_when()` and not allow empty `...`. Newer `replace_when()` and
   # `replace_values()` are a no-op for this case, but we superseded
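
Two claims in this file's documentation can be checked with functions that predate this commit: the removed "Connection to `case_when()`" equivalence, and the type change that the new examples contrast with `replace_values()`. A minimal sketch, assuming dplyr 1.1.0 or later is installed; it deliberately avoids `recode_values()` and `replace_values()`, which need a build that includes this change, and the variable names are illustrative:

```r
library(dplyr)

x <- c("a", "b", "c", "d", "e", NA)

# Logical conditions on the left-hand side.
via_when <- case_when(
  x %in% c("a", "b") ~ 1,
  x %in% "c" ~ 2,
  x %in% c("d", "e") ~ 3
)

# Values to match against `x` on the left-hand side.
via_match <- case_match(
  x,
  c("a", "b") ~ 1,
  "c" ~ 2,
  c("d", "e") ~ 3
)

# Both leave the unmatched NA as a missing value.
same <- identical(via_when, via_match)
print(same)

# `case_match()` takes the common type of the right-hand sides and `.default`,
# so an integer `y` combined with the double `0` comes back as double.
y <- as.integer(c(1, 2, 1, 3, 1, NA, 2, 4))
type_before <- typeof(y)
type_after <- typeof(case_match(y, NA ~ 0, .default = y))
print(c(type_before, type_after))
```

According to the new examples in this hunk, `replace_values(y, NA ~ 0)` keeps the type of `y` instead.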

_pkgdown.yml

Lines changed: 5 additions & 5 deletions
@@ -89,7 +89,6 @@ reference:
     not data frames.
   contents:
   - between
-  - case_match
   - case_when
   - recode_values
   - coalesce
@@ -129,14 +128,15 @@ reference:
     to be superior, but we don't want to force you to change until you're
     ready, so the existing functions will stay around for several years.
   contents:
+  - all_vars
+  - case_match
+  - recode
   - sample_frac
-  - top_n
   - scoped
-  - ends_with("_at")
-  - all_vars
+  - top_n
   - vars
   - with_groups
-  - recode
+  - ends_with("_at")
 
 - title: Remote tables
   contents:
