@DavisVaughan DavisVaughan commented Oct 2, 2025

Finally able to remove dplyr:::vec_case_when() 🎉. It has served us well over the years 🫡


100 million values, 3 conditions

cross::bench_branches(\(x) {
  library(dplyr)
  set.seed(123)

  values <- list(
    sample(1000000, 1e8, replace = TRUE),
    sample(1000000, 1e8, replace = TRUE),
    sample(1000000, 1e8, replace = TRUE)
  )

  conditions <- list(
    sample(c(TRUE, FALSE), 1e8, replace = TRUE),
    sample(c(TRUE, FALSE), 1e8, replace = TRUE),
    sample(c(TRUE, FALSE), 1e8, replace = TRUE)
  )

  bench::mark(
    case_when(
      conditions[[1]] ~ values[[1]],
      conditions[[2]] ~ values[[2]],
      conditions[[3]] ~ values[[3]]
    ),
    iterations = 5
  )
})
# A tibble: 2 × 14
  branch        expression      min  median `itr/sec` mem_alloc `gc/sec` n_itr  n_gc total_time result
  <chr>         <bch:expr> <bch:tm> <bch:t>     <dbl> <bch:byt>    <dbl> <int> <dbl>   <bch:tm> <list>
1 feature/case… case_when… 158.15ms 173.4ms     5.77    381.6MB     8.65     2     3    346.8ms <int> 
2 main          case_when…    2.23s   2.25s     0.428     4.1GB     1.11     5    13      11.7s <int> 
# ℹ 3 more variables: memory <list>, time <list>, gc <list>
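
To put the table in relative terms, the ratios can be worked out from the reported medians and `mem_alloc` values. This is a minimal base R sketch: the variable names are illustrative, and it assumes the GB and MB units printed by bench differ by a factor of 1024.

```r
# Figures copied from the table above; the names are illustrative.
median_sec <- c(branch = 0.1734, main = 2.25)      # 173.4ms vs 2.25s
mem_mb     <- c(branch = 381.6, main = 4.1 * 1024) # 381.6MB vs 4.1GB (assumes 1GB = 1024MB)

# How many times faster, and how many times less memory, the branch is.
speedup     <- median_sec[["main"]] / median_sec[["branch"]]
mem_savings <- mem_mb[["main"]] / mem_mb[["branch"]]

round(c(speedup = speedup, mem_savings = mem_savings), 1)
```

On these numbers the branch is roughly 13x faster at the median and allocates roughly 11x less memory.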

The most satisfying part is looking at the memory allocations. This is the result of A LOT of vctrs work! 🎉

With this PR:

Memory allocations:
Number of 'new page' entries not displayed: 1
       what     bytes                                                                    calls
1     alloc       280 case_when() -> case_formula_evaluate() -> new_environment() -> new.env()
2     alloc       216              case_when() -> case_when_size_common() -> vec_size_common()
3     alloc       216              case_when() -> case_when_size_common() -> vec_size_common()
5     alloc       216                                           case_when() -> vec_case_when()
6     alloc 400000048                                           case_when() -> vec_case_when()
total       400000976   
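
The one large entry, 400000048 bytes, is consistent with a single integer vector of length 1e8. A rough check, assuming 4 bytes per integer and a 48-byte vector header (which is what a typical 64-bit R build reports; treat the header size as an assumption):

```r
n             <- 1e8 # length of each input vector in the benchmark
bytes_per_int <- 4   # R integers are 32-bit
header_bytes  <- 48  # assumed vector header size on a 64-bit build

expected <- n * bytes_per_int + header_bytes
format(expected, big.mark = ",")

# The same arithmetic can be compared against a smaller real vector.
object.size(integer(1e6))
```

In other words, the only sizeable allocation left is one the size of the output vector; every other entry is a few hundred bytes.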

With CRAN dplyr:

Memory allocations:
       what      bytes                                                                                            calls
1     alloc        280                         case_when() -> case_formula_evaluate() -> new_environment() -> new.env()
2     alloc        216                                                                 case_when() -> vec_size_common()
3     alloc        216 case_when() -> vec_case_when() -> names_as_error_names() -> vec_paste0() -> vec_recycle_common()
4     alloc        216 case_when() -> vec_case_when() -> names_as_error_names() -> vec_paste0() -> vec_recycle_common()
5     alloc        216                                             case_when() -> vec_case_when() -> vec_ptype_common()
6     alloc  400000048                                                      case_when() -> vec_case_when() -> vec_rep()
7     alloc  400000048                                                                   case_when() -> vec_case_when()
8     alloc  400000056                                                        case_when() -> vec_case_when() -> which()
9     alloc  199973472                                                        case_when() -> vec_case_when() -> which()
10    alloc  400000048                                                                   case_when() -> vec_case_when()
11    alloc  400000056                                                        case_when() -> vec_case_when() -> which()
12    alloc  100008272                                                        case_when() -> vec_case_when() -> which()
13    alloc  400000048                                                                   case_when() -> vec_case_when()
14    alloc  400000056                                                        case_when() -> vec_case_when() -> which()
15    alloc   50016976                                                        case_when() -> vec_case_when() -> which()
16    alloc  400000056                                                        case_when() -> vec_case_when() -> which()
17    alloc   50001472                                                        case_when() -> vec_case_when() -> which()
18    alloc  199973472                                                    case_when() -> vec_case_when() -> vec_slice()
19    alloc  100008272                                                    case_when() -> vec_case_when() -> vec_slice()
20    alloc   50016976                                                    case_when() -> vec_case_when() -> vec_slice()
21    alloc   50001472                                                  case_when() -> vec_case_when() -> vec_recycle()
22    alloc  400000048                                                  case_when() -> vec_case_when() -> list_unchop()
total       4400001992                                                                   
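
For anyone wanting to inspect a similar breakdown locally, `bench::mark()` keeps an allocation profile per expression in the `memory` list-column shown in the tibble footer. The following is a minimal sketch on a much smaller input. It assumes R was built with memory profiling (`capabilities("profmem")`), and the sizes and object names are illustrative rather than taken from this PR.

```r
library(dplyr)

set.seed(123)
n <- 1e5 # far smaller than the 1e8 used in the benchmark above

values <- list(
  sample(1000, n, replace = TRUE),
  sample(1000, n, replace = TRUE)
)

conditions <- list(
  sample(c(TRUE, FALSE), n, replace = TRUE),
  sample(c(TRUE, FALSE), n, replace = TRUE)
)

res <- bench::mark(
  case_when(
    conditions[[1]] ~ values[[1]],
    conditions[[2]] ~ values[[2]]
  ),
  iterations = 1
)

# One allocation profile per benchmarked expression.
prof <- res$memory[[1]]
print(prof)

# Total bytes allocated while evaluating the expression.
sum(prof$bytes, na.rm = TRUE)
```

The exact entries will differ by dplyr and vctrs version, so only the overall shape should be compared with the listings above.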

@DavisVaughan DavisVaughan merged commit ef1498e into main Oct 2, 2025
14 checks passed
@DavisVaughan DavisVaughan deleted the feature/case-when-on-vctrs branch October 2, 2025 19:38