Closed
Description
Currently, this is the message dplyr emits for summarize() after group_by() with multiple variables:
library(dplyr)

mtcars |>
  group_by(vs, am) |>
  summarize(mean_mpg = mean(mpg))
#> `summarise()` has grouped output by 'vs'. You can override using the `.groups`
#> argument.
#> # A tibble: 4 × 3
#> # Groups:   vs [2]
#>      vs    am mean_mpg
#>   <dbl> <dbl>    <dbl>
#> 1     0     0     15.0
#> 2     0     1     19.8
#> 3     1     0     20.7
#> 4     1     1     28.4

Created on 2024-01-29 with reprex v2.0.2
I think this message is still confusing. It would be clearer if it described the grouping of the output and explicitly stated that `.groups` is an argument of `summarize()`, e.g.,
The output is grouped by `vs`. You can specify grouping structure of the output using the `.groups` argument in `summarize()`.
If going this route, some things to keep in mind:
- Maybe "result" instead of "output" in two places in the message, or change the description of the `.groups` argument to say "Grouping structure of the output." Basically, we should match what we're calling the "thing" that the function spits out.
- It would be a nice-to-have if the US/UK spelling of the function in the message matched the spelling in the code that generates the message.
An alternative suggestion by @DavisVaughan was
summarize() has computed your expressions grouped by (foo, bar), and has regrouped the output by (foo).
I think this is an improvement over the current message too, but I'd suggest going with something simpler like the one above.
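For context, here is a minimal sketch of what the `.groups` argument does to the grouping of the output, using the same `mtcars` example. The object names (`by_vs_am`, `res_default`, and so on) are made up for illustration; `group_vars()` is only used to inspect which grouping variables remain.

```r
library(dplyr)

by_vs_am <- mtcars |> group_by(vs, am)

# Default behaviour: the last grouping variable is dropped and the message is emitted.
res_default <- by_vs_am |> summarize(mean_mpg = mean(mpg))
group_vars(res_default)
#> [1] "vs"

# Same grouping as the default, but requested explicitly, so no message is emitted.
res_drop_last <- by_vs_am |> summarize(mean_mpg = mean(mpg), .groups = "drop_last")
group_vars(res_drop_last)
#> [1] "vs"

# Keep the full grouping structure of the input.
res_keep <- by_vs_am |> summarize(mean_mpg = mean(mpg), .groups = "keep")
group_vars(res_keep)
#> [1] "vs" "am"

# Return an ungrouped tibble.
res_drop <- by_vs_am |> summarize(mean_mpg = mean(mpg), .groups = "drop")
group_vars(res_drop)
#> character(0)
```

In every case the means are computed per `vs`/`am` combination; `.groups` only changes how the output is grouped afterwards, which is the distinction the proposed wording tries to make explicit.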