Request for a group_subset() function

Hi, I frequently find myself trying to subset a particular group from a grouped data frame, usually for troubleshooting. There's already a set of group_*() helper functions, which is where I usually look first for this task. You can make these work to select a group, or call filter() and manually filter down to a single group, but either way it's a bit tedious, especially when you just want to quickly grab a random group or two for dev/debugging purposes. The most efficient way I can find to do this is:
`grouped_df[group_rows(grouped_df)[[1]],]`

This subsets the data from the first group. However, it's tedious and difficult to remember. Plus, it doesn't work well with pipes, as the data frame must be referenced twice (and pipes don't play well with `[` subsetting in the first place). For demonstration, the piped equivalent is:
`grouped_df %>% group_rows() %>% .[[1]] %>% grouped_df[.,]`
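For anyone who wants to reproduce this, here is a minimal sketch of both workarounds. The `grouped_df` tibble and its columns `g` and `x` are made up for illustration and are not from any real data set.

```r
library(dplyr)

# Illustrative data only: three groups of sizes 2, 1 and 3
grouped_df <- tibble(
  g = c("a", "a", "b", "c", "c", "c"),
  x = 1:6
) %>%
  group_by(g)

# Base-style workaround: group_rows() gives one vector of row indices per group,
# so the first element holds the rows of the first group
first_group <- grouped_df[group_rows(grouped_df)[[1]], ]

# Piped equivalent: the data frame still has to be named twice
first_group_piped <- grouped_df %>%
  group_rows() %>%
  .[[1]] %>%
  grouped_df[., ]
```

Both calls return the rows of group "a", which is the result I want, just with more typing than seems necessary.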

Both of these are ugly and hard to remember, so I think it would be nice to have a helper function specifically for this purpose. It could be called group_subset() or group_select(), though the latter could be confused with select() (groups are row-based, but I can see why one might want to avoid the name). Heck, I would actually argue for replacing group_data(), as you'd be forgiven for thinking this is what group_data() is for. But it's not: it returns the group keys plus a `.rows` column of row numbers, not the data itself, which is misleading imo. In fact, group_data() is so similar to group_rows() that I would argue they're basically redundant and group_data() could simply be repurposed.
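To make the overlap concrete, here is a small sketch comparing the two helpers. The `grouped_df` tibble is again made up for illustration.

```r
library(dplyr)

# Illustrative data only
grouped_df <- tibble(
  g = c("a", "a", "b", "c", "c", "c"),
  x = 1:6
) %>%
  group_by(g)

# group_data(): one row per group, with the grouping column(s)
# and a `.rows` list column of row indices
gd <- group_data(grouped_df)

# group_rows(): only the list of row indices
gr <- group_rows(grouped_df)

# Column names of the group_data() result
gd_names <- names(gd)
```

In this example the `.rows` column of `group_data()` carries the same indices that `group_rows()` returns, and neither result contains the `x` values.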

Anyways, my envisioned syntax to replace the above calls is:
`grouped_df %>% group_subset(1)`
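A rough sketch of what I have in mind, built only on existing dplyr functions. To be clear, `group_subset()` is hypothetical and not part of dplyr, and the argument name `i` and the bounds check are just my assumptions about how it might behave.

```r
library(dplyr)

# Hypothetical helper, NOT part of dplyr: return the rows of the groups
# selected by position `i` (one or more group indices)
group_subset <- function(.data, i) {
  stopifnot(is_grouped_df(.data))
  rows <- group_rows(.data)
  stopifnot(all(i >= 1), all(i <= length(rows)))
  # Collect the row indices of every requested group, in the order given
  idx <- unlist(lapply(i, function(k) rows[[k]]))
  .data[idx, ]
}

# Illustrative data only
grouped_df <- tibble(
  g = c("a", "a", "b", "c", "c", "c"),
  x = 1:6
) %>%
  group_by(g)

second_group <- grouped_df %>% group_subset(2)
two_groups <- grouped_df %>% group_subset(c(1, 3))
```

With something like this, grabbing a random group for debugging becomes `grouped_df %>% group_subset(sample(n_groups(grouped_df), 1))`.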

This would be a nice, clean way to return a single group's subset via a group index. If you're strongly averse to adding new functions or making breaking changes, then group_data() could at least be modified to return a data column. Then you could do
`grouped_df %>% group_data() %>% slice(1) %>% pull(.data)`
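This can be approximated today by adding the column by hand, which gives a feel for the proposed behaviour. It's only a sketch: the column is named `data` here rather than `.data` (my choice, since `.data` is also the name of the pronoun used in dplyr's data masking), and `grouped_df` is made-up illustrative data.

```r
library(dplyr)

# Illustrative data only
grouped_df <- tibble(
  g = c("a", "a", "b", "c", "c", "c"),
  x = 1:6
) %>%
  group_by(g)

# Emulate a group_data() that also carries the data:
# one list element per group, each holding that group's rows
with_data <- group_data(grouped_df) %>%
  mutate(data = lapply(.rows, function(r) grouped_df[r, ]))

# pull() on a list column returns a list, so one more step is
# needed to get at the data frame itself
first_group <- with_data %>%
  slice(1) %>%
  pull(data) %>%
  .[[1]]
```

Note the extra `.[[1]]` at the end, which is part of why I'd still prefer a dedicated function.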

This would at least make group_data() true to its name and be an improvement. But I still prefer the dedicated function option (e.g. group_subset()), and it seems reasonable given there's already a suite of helper functions.

Request for a group_subset() function #7625

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Request for a group_subset() function #7625

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions