Skip to content

Commit ae0b618

Browse files
committed
Only consult LHS inputs to determine case_when() output size
1 parent 0c01fa5 commit ae0b618

File tree

5 files changed

+130
-63
lines changed

5 files changed

+130
-63
lines changed

NEWS.md

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,7 @@
11
# dplyr (development version)
22

3+
* `case_when()` now determines the output size from only the LHS inputs, rather than from both the LHS and RHS inputs. This allows the legacy behavior of `TRUE ~` to continue to work, while improving `case_when()`'s safety by guarding against some silent confusing failure cases (#7082).
4+
35
* `bind_rows()` now replaces empty (or `NA`) element names in a list with its numeric index while preserving existing names (#7719, @Meghansaha).
46

57
* New `slice_sample()` example showing how to use it to shuffle rows (#7707, @Hzanib).

R/case-when.R

Lines changed: 21 additions & 14 deletions
Original file line numberDiff line numberDiff line change
@@ -16,18 +16,18 @@
1616
#'
1717
#' The RHS inputs will be coerced to their common type.
1818
#'
19-
#' All inputs will be recycled to their common size. That said, we encourage
20-
#' all LHS inputs to be the same size. Recycling is mainly useful for RHS
21-
#' inputs, where you might supply a size 1 input that will be recycled to the
22-
#' size of the LHS inputs.
19+
#' For historical reasons, all LHS inputs will be recycled to their common
20+
#' size. That said, we encourage all LHS inputs to be the same size, which you
21+
#' can optionally enforce with `.size`. All RHS inputs will be recycled to the
22+
#' common size of the LHS inputs.
2323
#'
2424
#' `NULL` inputs are ignored.
2525
#'
2626
#' @param .default The value used when all of the LHS inputs return either
2727
#' `FALSE` or `NA`.
2828
#'
2929
#' `.default` must be size 1 or the same size as the common size computed
30-
#' from `...`.
30+
#' from the LHS inputs.
3131
#'
3232
#' `.default` participates in the computation of the common type with the RHS
3333
#' inputs.
@@ -45,11 +45,12 @@
4545
#' supplied, this overrides the common type of the RHS inputs.
4646
#'
4747
#' @param .size An optional size declaring the desired output size. If supplied,
48-
#' this overrides the common size computed from `...`.
48+
#' this overrides the common size computed from the LHS inputs.
4949
#'
50-
#' @return A vector with the same size as the common size computed from the
51-
#' inputs in `...` and the same type as the common type of the RHS inputs
52-
#' in `...`.
50+
#' @return A vector
51+
#'
52+
#' - The size of the vector is the common size of the LHS inputs, or `.size`.
53+
#' - The type of the vector is the common type of the RHS inputs, or `.ptype`.
5354
#'
5455
#' @seealso [case_match()]
5556
#'
@@ -162,12 +163,18 @@ case_when <- function(..., .default = NULL, .ptype = NULL, .size = NULL) {
162163
conditions <- args$lhs
163164
values <- args$rhs
164165

165-
# `case_when()`'s formula interface finds the common size of ALL of its inputs.
166-
# This is what allows `TRUE ~` to work.
167-
.size <- vec_size_common(!!!conditions, !!!values, .size = .size)
168-
166+
# Unfortunate historical behavior patch:
167+
#
168+
# `case_when()`'s formula interface finds the common size of all LHS inputs.
169+
# This is what allows `TRUE ~` to work. Unfortunately, this has caused a lot
170+
# of confusion over the years when `TRUE ~` isn't involved (#7082) and we wish
171+
# that we could enforce that all LHS inputs must be the same size (without
172+
# recycling).
173+
#
174+
# Additionally, for `case_when()`, the LHS common size is the size that each
175+
# RHS value must recycle to.
176+
.size <- vec_size_common(!!!conditions, .size = .size)
169177
conditions <- vec_recycle_common(!!!conditions, .size = .size)
170-
values <- vec_recycle_common(!!!values, .size = .size)
171178

172179
vec_case_when(
173180
conditions = conditions,

man/case_when.Rd

Lines changed: 11 additions & 9 deletions
Some generated files are not rendered by default. Learn more about customizing how changed files appear on GitHub.

tests/testthat/_snaps/case-when.md

Lines changed: 47 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -20,7 +20,7 @@
2020
case_when(TRUE ~ 1:2, .size = 3)
2121
Condition
2222
Error in `case_when()`:
23-
! Can't recycle `..1 (right)` (size 2) to size 3.
23+
! `..1 (right)` must have size 3, not size 2.
2424

2525
# invalid type errors are correct (#6261) (#6206)
2626

@@ -66,14 +66,57 @@
6666
Caused by error:
6767
! oh no!
6868

69+
# only LHS sizes are consulted to compute the output size (#7082)
70+
71+
Code
72+
x <- 1
73+
case_when(x == 1 ~ "a", x == 2 ~ character(), .default = "other")
74+
Condition
75+
Error in `case_when()`:
76+
! `..2 (right)` must have size 1, not size 0.
77+
78+
---
79+
80+
Code
81+
case_when(TRUE ~ 1:3, FALSE ~ 4:6)
82+
Condition
83+
Error in `case_when()`:
84+
! `..1 (right)` must have size 1, not size 3.
85+
86+
---
87+
88+
Code
89+
case_when(NA ~ 1:3, TRUE ~ 4:6)
90+
Condition
91+
Error in `case_when()`:
92+
! `..1 (right)` must have size 1, not size 3.
93+
94+
---
95+
96+
Code
97+
x <- 1:3
98+
my_scalar_condition <- is.double(x)
99+
case_when(my_scalar_condition ~ x, TRUE ~ 4:6)
100+
Condition
101+
Error in `case_when()`:
102+
! `..1 (right)` must have size 1, not size 3.
103+
104+
---
105+
106+
Code
107+
case_when(TRUE ~ integer(), FALSE ~ integer())
108+
Condition
109+
Error in `case_when()`:
110+
! `..1 (right)` must have size 1, not size 0.
111+
69112
# case_when() give meaningful errors
70113

71114
Code
72115
(expect_error(case_when(c(TRUE, FALSE) ~ 1:3, c(FALSE, TRUE) ~ 1:2)))
73116
Output
74-
<error/vctrs_error_incompatible_size>
117+
<error/vctrs_error_assert_size>
75118
Error in `case_when()`:
76-
! Can't recycle `..1 (left)` (size 2) to match `..1 (right)` (size 3).
119+
! `..1 (right)` must have size 2, not size 3.
77120
Code
78121
(expect_error(case_when(c(TRUE, FALSE) ~ 1, c(FALSE, TRUE, FALSE) ~ 2, c(FALSE,
79122
TRUE, FALSE, NA) ~ 3)))
@@ -86,7 +129,7 @@
86129
Output
87130
<error/rlang_error>
88131
Error in `case_when()`:
89-
! `..1 (left)` must be a logical vector, not a double vector.
132+
! `..1 (left)` must be a logical vector, not the number 50.
90133
Code
91134
(expect_error(case_when(paste(50))))
92135
Output

tests/testthat/test-case-when.R

Lines changed: 49 additions & 36 deletions
Original file line numberDiff line numberDiff line change
@@ -68,40 +68,6 @@ test_that("any `TRUE` overrides an `NA`", {
6868
)
6969
})
7070

71-
test_that("atomic conditions (#2909)", {
72-
expect_equal(
73-
case_when(
74-
TRUE ~ 1:3,
75-
FALSE ~ 4:6
76-
),
77-
1:3
78-
)
79-
expect_equal(
80-
case_when(
81-
NA ~ 1:3,
82-
TRUE ~ 4:6
83-
),
84-
4:6
85-
)
86-
})
87-
88-
test_that("zero-length conditions and values (#3041)", {
89-
expect_equal(
90-
case_when(
91-
TRUE ~ integer(),
92-
FALSE ~ integer()
93-
),
94-
integer()
95-
)
96-
expect_equal(
97-
case_when(
98-
logical() ~ 1,
99-
logical() ~ 2
100-
),
101-
numeric()
102-
)
103-
})
104-
10571
test_that("case_when can be used in anonymous functions (#3422)", {
10672
res <- tibble(a = 1:3) |>
10773
mutate(b = (function(x) case_when(x < 2 ~ TRUE, .default = FALSE))(a)) |>
@@ -210,8 +176,10 @@ test_that("NULL inputs are compacted", {
210176

211177
test_that("passes through `.default` correctly", {
212178
expect_identical(case_when(FALSE ~ 1, .default = 2), 2)
213-
expect_identical(case_when(FALSE ~ 1:5, .default = 2), rep(2, 5))
214-
expect_identical(case_when(FALSE ~ 1:5, .default = 2:6), 2:6)
179+
expect_identical(
180+
case_when(c(TRUE, FALSE, TRUE, FALSE, TRUE) ~ 1:5, .default = 2),
181+
c(1, 2, 3, 2, 5)
182+
)
215183
})
216184

217185
test_that("`.default` isn't part of recycling", {
@@ -269,6 +237,51 @@ test_that("throws chained errors when formula evaluation fails", {
269237
})
270238
})
271239

240+
test_that("only LHS sizes are consulted to compute the output size (#7082)", {
241+
# Motivating example from #7082. We want this case to fail. LHS common size is
242+
# 1. RHS inputs must recycle to this size. If both the LHS and RHS inputs were
243+
# consulted to compute a common size of 0, this would incorrectly return
244+
# `character()`, which is a silent confusing failure.
245+
expect_snapshot(error = TRUE, {
246+
x <- 1
247+
case_when(
248+
x == 1 ~ "a",
249+
x == 2 ~ character(),
250+
.default = "other"
251+
)
252+
})
253+
254+
# The following all used to work but we believe they are not good
255+
# examples of how to use `case_when()`, and are not worth supporting if it
256+
# means that the actual bug captured above silently slips through
257+
258+
# Used to return `1:3` and `4:6` respectively (#2909)
259+
# This could only practically affect you if you are using `case_when()` like
260+
# a very weird if/else, which we do not encourage.
261+
expect_snapshot(error = TRUE, {
262+
case_when(TRUE ~ 1:3, FALSE ~ 4:6)
263+
})
264+
expect_snapshot(error = TRUE, {
265+
case_when(NA ~ 1:3, TRUE ~ 4:6)
266+
})
267+
expect_snapshot(error = TRUE, {
268+
# In theory a user could have done something very weird like this
269+
x <- 1:3
270+
my_scalar_condition <- is.double(x)
271+
case_when(my_scalar_condition ~ x, TRUE ~ 4:6)
272+
})
273+
274+
# Used to return `integer()` (#3041).
275+
# If you go back and look at #3041, it was an issue about the case where all
276+
# LHS inputs have size 0 and all RHS inputs have size 1. That still works as
277+
# expected. This extra test of LHS inputs having size 1 and RHS inputs having
278+
# size 0 was not something the user wanted, and does not make sense to us.
279+
expect_snapshot(error = TRUE, {
280+
case_when(TRUE ~ integer(), FALSE ~ integer())
281+
})
282+
expect_identical(case_when(logical() ~ 1, logical() ~ 2), numeric())
283+
})
284+
272285
test_that("case_when() give meaningful errors", {
273286
expect_snapshot({
274287
(expect_error(

0 commit comments

Comments
 (0)