With the introduction of .by, we no longer sort group keys automatically. There are a whole host of good reasons for this, as outlined in #5664 (comment), and I am mostly confident this is the right long-term default for dplyr.
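To make the difference concrete, here is a small sketch comparing the two grouping styles. The data frame and column names are made up for illustration, and it assumes dplyr 1.1.0 or later (where .by is available).

```r
library(dplyr)

# Toy data: the key "b" appears before "a"
df <- tibble(a = c("b", "a", "b", "a"), x = 1:4)

# group_by() sorts the group keys before summarising
by_group <- df %>%
  group_by(a) %>%
  summarise(total = sum(x))

# .by keeps the keys in order of first appearance
by_dot <- df %>%
  summarise(total = sum(x), .by = a)

by_group$a  # "a" "b"
by_dot$a    # "b" "a"
```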
However, I am empathetic to the fact that users do often like to see their summary results sorted in ascending order. Right now, our recommendation is:
df %>%
  summarise(..., .by = c(a, b, c)) %>%
  arrange(a, b, c) # could also come before `summarise()`

This is nice because you get the full power of arrange(), including desc() and .locale.
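As a runnable illustration of that recommendation, the sketch below uses a hypothetical data frame with two keys and sorts one of them in descending order, which a plain "sort the keys" switch could not express. It assumes dplyr 1.1.0 or later.

```r
library(dplyr)

# Hypothetical data with two grouping keys
sales <- tibble(
  a = c("x", "y", "x", "y"),
  b = c(2, 1, 1, 2),
  v = 1:4
)

res <- sales %>%
  summarise(total = sum(v), .by = c(a, b)) %>%
  # ascending in a, descending in b, compared in the C locale
  arrange(a, desc(b), .locale = "C")

res
```

Here the keys come back as (x, 2), (x, 1), (y, 2), (y, 1) rather than in order of first appearance.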
I think we should consider a .sort argument like:
df %>%
  summarise(..., .by = c(a, b, c), .sort = TRUE)

- .sort = FALSE would be the default for reasons mentioned above.
- We'd document this as the 100% backwards compatible way to transition from group_by() to .by (even though most of the time the ordering isn't important).
- You must accept that you get ascending order and the C locale. That makes it compatible with group_by(). If you need anything fancier, call arrange().
- I do like that you won't have to repeat the group names.
- Obviously .sort = TRUE errors on unorderable types like clock's year-month-weekday.
- This would probably only be an argument for the .data.frame method, as opposed to the generic, because dbplyr probably won't want to enforce a sort order? Uncertain.
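The proposed behaviour can be approximated today with a small wrapper. This is only a sketch under stated assumptions: summarise_sorted() is a hypothetical helper name, not part of dplyr, and it assumes an ungrouped data frame, a non-empty .by, and dplyr 1.1.0 or later. It is not how .sort would actually be implemented.

```r
library(dplyr)

# Hypothetical helper emulating `.sort = TRUE`:
# summarise with `.by`, then sort by the same keys, ascending, in the C locale.
summarise_sorted <- function(.data, ..., .by) {
  # Resolve the tidy-selection in `.by` to column names
  keys <- names(select(.data, {{ .by }}))
  out <- summarise(.data, ..., .by = {{ .by }})
  arrange(out, pick(all_of(keys)), .locale = "C")
}

df <- tibble(g = c("b", "A", "a", "B"), x = 1:4)
sorted <- summarise_sorted(df, total = sum(x), .by = g)

sorted$g  # "A" "B" "a" "b": in the C locale, uppercase sorts before lowercase
```

The group names are written once, which is the convenience the proposal is after. Anything beyond ascending C-locale order still needs an explicit arrange().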
Basically, this leaves the idea of a groupby + summarise operation theoretically pure (because it shouldn't require orderable keys), but also gives users a convenient way to optionally opt in to sorted results.
There are 3 functions that would get this argument:
- summarise()
- reframe()
- slice_sample() (goes with "slice() and slice_head/tail/min/max() should act like a filter() not a reframe()" #6662)
The following would not get .sort because they aren't about row ordering:
- filter()
- mutate()
- slice() and slice_min/max/head/tail() (after "slice() and slice_head/tail/min/max() should act like a filter() not a reframe()" #6662 is changed)
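A quick sketch of why these verbs don't need the argument: with .by, filter() and mutate() leave rows in their original positions, so there is no key ordering to choose. The toy data is made up for illustration, and dplyr 1.1.0 or later is assumed.

```r
library(dplyr)

df <- tibble(a = c("b", "a", "b", "a"), x = 1:4)

# Keep the largest x per group; surviving rows stay in their original order
kept <- df %>% filter(x == max(x), .by = a)

# Add a per-group mean; every row stays where it was
with_mean <- df %>% mutate(m = mean(x), .by = a)

kept$a       # "b" "a"
with_mean$m  # 2 3 2 3
```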