Respect column names when "splicing" matrices in summarize()? #7690

@MichaelChirico

Compare:

penguins %>%
  summarize(.by = species, rbind(c(a = mean(bill_len), b = sd(bill_len))))
#     species rbind(c(a = mean(bill_len), b = sd(bill_len))).1 rbind(c(a = mean(bill_len), b = sd(bill_len))).2
# 1    Adelie                                               NA                                               NA
# 2    Gentoo                                               NA                                               NA
# 3 Chinstrap                                        48.833824                                         3.339256

penguins %>%
  summarize(.by = species, data.frame(as.list(c(a = mean(bill_len), b = sd(bill_len)))))
#     species        a        b
# 1    Adelie       NA       NA
# 2    Gentoo       NA       NA
# 3 Chinstrap 48.83382 3.339256

That is, when summarize() auto-converts a matrix into columns, we get ugly deparse()-inferred names and lose a and b entirely (they become the .1 and .2 suffixes), but when it does so for a data.frame, we get the nested data.frame's own column names.

When we control the function, we can have it return a data.frame/tibble instead, but sometimes we don't want to touch the function, and the rbind() approach reads more nicely (even if it amounts to the same operation under the hood).
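
For reference, a possible stopgap in the meantime (a minimal sketch, not a claim about how dplyr should handle this): since rbind() on a named vector already carries a and b as column names, converting the matrix with as.data.frame() at the call site gets the data.frame behaviour without touching the function itself. Here `as_cols` is a hypothetical helper name, mtcars stands in for penguins so the snippet runs on any R version, and dplyr >= 1.1.0 is assumed for `.by`.

```r
library(dplyr)

# Hypothetical helper (not part of dplyr): turn a matrix with column names
# into a data.frame so that summarize() unpacks it using those names.
as_cols <- function(m) as.data.frame(m)

# rbind() on a named vector gives a 1-row matrix whose colnames are the names.
colnames(rbind(c(a = 1, b = 2)))
# [1] "a" "b"

# Assumption: mtcars/mpg/cyl are stand-ins for penguins/bill_len/species.
res <- mtcars %>%
  summarize(.by = cyl, as_cols(rbind(c(a = mean(mpg), b = sd(mpg)))))

names(res)
# [1] "cyl" "a"   "b"
```

This still requires editing each summarize() call, which is the part the request above would make unnecessary.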

The context here is related to #7689: I'm trying to convert some plyr::ddply() calls into "standard" {dplyr} code as cleanly and simply as I can, without drastically refactoring downstream package code if I can avoid it.
