This is an umbrella issue to make dplyr-spark tables more data-frameish. Standard procedure should be to open an issue for each of the specific points and mention this one.
- implement sample_n and sample_frac #2 sampling
- slice #3 slicing
- nrow. Returns NA instead of the actual count, motivation being that
- summary. There is no summary method in dplyr, so summary actually treats a table as a list. Sad
- create from file. Like a read.table or some such. Maybe an extension to copy_to, based on LOAD INPATH
- dropping of rownames in copy_to. dplyr boycotts rownames (I understand that) but I'd prefer creating a col rather than dropping the information altogether. The party line is: don't use rownames, use a col. Well, we should lead by example and copy rownames to a col
- names: should it return the same as colnames?
- add more, we want a complete data frame illusion
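
For the rownames point, here is a minimal base-R sketch of what could happen to a data frame before it is handed to copy_to. The helper name `rownames_to_col` and the default column name `"rowname"` are made up for illustration and are not part of dplyr or dplyr-spark; how this would hook into copy_to is left open.

```r
# Hypothetical helper: keep row names as a regular first column
# instead of dropping them. Names here are assumptions for illustration.
rownames_to_col <- function(df, col = "rowname") {
  stopifnot(is.data.frame(df), !(col %in% names(df)))
  # Build a one-column data frame from the row names and bind it in front.
  out <- cbind(
    setNames(data.frame(rownames(df), stringsAsFactors = FALSE), col),
    df
  )
  # Reset to automatic row names so the information lives only in the column.
  rownames(out) <- NULL
  out
}

cars <- rownames_to_col(head(mtcars, 3))
names(cars)[1]    # "rowname"
cars$rowname[1]   # "Mazda RX4"
```

The result could then be passed to copy_to as usual, so the information that used to live in the row names survives as an ordinary column.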