make more data-frame-ish #24

@piccolbo

This is an umbrella issue to make dplyr-spark tables more data-frame-ish. Standard procedure should be to open an issue for each of the specific points and mention this one.

  • implement sample_n and sample_frac #2 sampling
  • slice #3 slicing
  • nrow. Returns NA instead of the actual count, motivation being that
  • summary. There is no summary method in dplyr, so summary actually treats a table as a list. Sad
  • create from file. Something like read.table. Maybe an extension to copy_to, based on LOAD DATA INPATH
  • dropping of rownames in copy_to. dplyr boycotts rownames (I understand that) but I'd prefer creating a col rather than dropping the information altogether. The party line is: don't use rownames, use a col. Well, we should lead by example and copy rownames to a col
  • names: should it return the same as colnames?
  • add more, we want a complete data frame illusion
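
To make the rownames and names points concrete, here is a minimal sketch in base R of the behaviour I have in mind, shown on a local data frame. `keep_rownames` is a hypothetical helper, not something that exists in dplyr or dplyr-spark; the idea would be to apply something like it inside copy_to before the upload.

```r
# Hypothetical helper (not part of dplyr or dplyr-spark): copy row names
# into an ordinary first column instead of silently dropping them.
keep_rownames <- function(df, col = "rowname") {
  stopifnot(is.data.frame(df), !(col %in% names(df)))
  out <- df
  out[[col]] <- rownames(df)   # preserve the information as a column
  rownames(out) <- NULL        # reset to the default 1..n row names
  out[c(col, setdiff(names(out), col))]  # put the new column first
}

cars <- keep_rownames(head(mtcars, 3))

# For a plain data frame, names and colnames agree; a table that wants to
# pass as a data frame should arguably do the same.
identical(names(cars), colnames(cars))
```

The column name `rowname` is only a default chosen for illustration; the helper stops if a column with that name already exists rather than overwriting it.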
