data.frames with custom S3 classes break dplyr inconsistently with confusing error messages. Incompatible with OOP expectations? #7731

@MilesMcBain

If your data.frame or tibble carries an extra class, much of dplyr breaks. The error messages are confusing because they claim that `x` must be a vector, not a data.frame.

Interestingly, summarise() appears to be unaffected.

Reprex:

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
classy_cars <- mtcars
class(classy_cars) <- c(class(classy_cars), "new_class")
class(classy_cars)
#> [1] "data.frame" "new_class"

classy_cars |> 
  select(mpg, cyl, disp)
#> Error in `eval_select_impl()`:
#> ! `x` must be a vector, not a <data.frame> object.

classy_cars |> 
  as_tibble() |>
  select(mpg, cyl, disp)
#> # A tibble: 32 × 3
#>      mpg   cyl  disp
#>    <dbl> <dbl> <dbl>
#>  1  21       6  160 
#>  2  21       6  160 
#>  3  22.8     4  108 
#>  4  21.4     6  258 
#>  5  18.7     8  360 
#>  6  18.1     6  225 
#>  7  14.3     8  360 
#>  8  24.4     4  147.
#>  9  22.8     4  141.
#> 10  19.2     6  168.
#> # ℹ 22 more rows
  
classy_cars |>
  mutate(
    foo = "foo"
  )
#> Error in `mutate()`:
#> ℹ In argument: `foo = "foo"`.
#> Caused by error in `vec_size()`:
#> ! `x` must be a vector, not a <data.frame> object.
  
classy_cars  |>
  filter(
    cyl > 10
  )
#> Error in `vec_slice()`:
#> ! `x` must be a vector, not a <data.frame> object.
  
classy_cars |>
  summarise(
    total_mpg = sum(mpg)
  )
#>   total_mpg
#> 1     642.9

Created on 2025-10-08 with reprex v2.1.1
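
For comparison, here is a sketch of the same select() call with the extra class placed before "data.frame" rather than after it, which is the conventional S3 ordering (most specific class first). It assumes vctrs recognises a data frame from the trailing entries of the class vector; that is an assumption on my part, not something shown in the reprex above. `classy_first` is a name introduced only for this example.

```r
library(dplyr)

# Same data as the reprex, but the extra class goes FIRST (subclass before parent).
# `classy_first` is a hypothetical name used only for this comparison.
classy_first <- mtcars
class(classy_first) <- c("new_class", class(classy_first))
class(classy_first)
#> [1] "new_class"  "data.frame"

# Assumption: with "data.frame" as the last class, the verb accepts the object.
selected <- classy_first |>
  select(mpg, cyl, disp)

names(selected)
```

If this runs without error, the failures above depend on where the extra class sits in the class vector, not on the mere presence of an extra class.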

Session info
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.4.0 (2024-04-24 ucrt)
#>  os       Windows 11 x64 (build 22621)
#>  system   x86_64, mingw32
#>  ui       RTerm
#>  language (EN)
#>  collate  English_Australia.utf8
#>  ctype    English_Australia.utf8
#>  tz       Australia/Brisbane
#>  date     2025-10-08
#>  pandoc   3.2 @ C:/PROGRA~1/RStudio/RESOUR~1/app/bin/quarto/bin/tools/ (via rmarkdown)
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package     * version    date (UTC) lib source
#>  cli           3.6.5      2025-04-23 [1] CRAN (R 4.4.3)
#>  digest        0.6.37     2024-08-19 [1] CRAN (R 4.4.1)
#>  dplyr       * 1.1.4.9000 2025-10-08 [1] Github (tidyverse/dplyr@2f9a846)
#>  evaluate      1.0.3      2025-01-10 [1] CRAN (R 4.4.2)
#>  fastmap       1.2.0      2024-05-15 [1] CRAN (R 4.4.1)
#>  fs            1.6.5      2024-10-30 [1] CRAN (R 4.4.2)
#>  generics      0.1.4      2025-05-09 [1] CRAN (R 4.4.0)
#>  glue          1.8.0      2024-09-30 [1] CRAN (R 4.4.1)
#>  htmltools     0.5.8.1    2024-04-04 [1] CRAN (R 4.4.1)
#>  knitr         1.49       2024-11-08 [1] CRAN (R 4.4.2)
#>  lifecycle     1.0.4      2023-11-07 [1] CRAN (R 4.4.1)
#>  magrittr      2.0.4      2025-09-12 [1] CRAN (R 4.4.3)
#>  pillar        1.11.1     2025-09-17 [1] CRAN (R 4.4.0)
#>  pkgconfig     2.0.3      2019-09-22 [1] CRAN (R 4.4.1)
#>  R6            2.6.1      2025-02-15 [1] CRAN (R 4.4.2)
#>  reprex        2.1.1      2024-07-06 [1] CRAN (R 4.4.1)
#>  rlang         1.1.6.9000 2025-10-08 [1] Github (r-lib/rlang@b351966)
#>  rmarkdown     2.29       2024-11-04 [1] CRAN (R 4.4.2)
#>  sessioninfo   1.2.2      2021-12-06 [1] CRAN (R 4.4.1)
#>  tibble        3.3.0      2025-06-08 [1] CRAN (R 4.4.3)
#>  tidyselect    1.2.1      2024-03-11 [1] CRAN (R 4.4.1)
#>  vctrs         0.6.5.9000 2025-10-08 [1] Github (r-lib/vctrs@5b539da)
#>  withr         3.0.2      2024-10-28 [1] CRAN (R 4.4.0)
#>  xfun          0.50       2025-01-07 [1] CRAN (R 4.4.2)
#>  yaml          2.3.10     2024-07-26 [1] CRAN (R 4.4.1)
#> 
#>  [1] C:/Users/msmcba/r_library/user_library_v4.4
#>  [2] C:/Program Files/R/R-4.4.0/library
#> 
#> ──────────────────────────────────────────────────────────────────────────────

This goes against my expectations of how OOP should work. I would expect that if a method is available for any of the object's classes, that method can be dispatched to fulfil the generic call. The fact that the object carries extra classes should not matter.
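
The expectation can be illustrated with base R alone. This is a minimal sketch: `describe` is a hypothetical generic made up for this example (it is not part of dplyr or base R), and the object is built the same way as in the reprex.

```r
# Base R only: S3 dispatch tries each entry of the class vector in order.
classy_cars <- mtcars
class(classy_cars) <- c(class(classy_cars), "new_class")

# Hypothetical generic with a data.frame method and a default method.
describe <- function(x, ...) UseMethod("describe")
describe.data.frame <- function(x, ...) "data.frame method"
describe.default <- function(x, ...) "default method"

# There is no describe.new_class, yet the data.frame method is still found.
describe(classy_cars)
#> [1] "data.frame method"

inherits(classy_cars, "data.frame")
#> [1] TRUE
```

Plain S3 dispatch finds the data.frame method regardless of the trailing extra class, which is the behaviour I expected from the dplyr verbs as well.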
