Description
I want to use the plyrmr package while keeping my existing code written in dplyr, so I want to use the "magic.wand" function. For simplicity I am using the "mtcars" dataset, which is stored at "/user/sgerony/mtcars" on HDFS (Hadoop Distributed File System).
The block of code I want to wrap mixes base R functions with dplyr functions. This is my code:
magic.wand(rename,TRUE)
filename <- "/user/sgerony/mtcars"
complex.function = function(x){
  x$carb <- x[, ncol(x)] * 2
  x$carb <- x$carb + 2
  x <- as.data.frame(rename(x, lol = carb))
  return(x)
}
magic.wand(complex.function)
# does NOT work
input(filename) %|% complex.function()
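For reference, this is the result I expect, computed locally in base R without dplyr or plyrmr. It is only a sketch of the intended output: `expected.result` is a name made up for this illustration, and it relies on carb being the last column of mtcars.

```r
# Base-R equivalent of complex.function, for checking the expected output locally.
# expected.result is a hypothetical helper name, not part of dplyr or plyrmr.
expected.result <- function(x) {
  x$carb <- x[, ncol(x)] * 2              # carb is the last column of mtcars
  x$carb <- x$carb + 2
  names(x)[names(x) == "carb"] <- "lol"   # same effect as rename(x, lol = carb)
  x
}

expected <- expected.result(mtcars)
print(head(expected[, c("mpg", "lol")]))
```

The plyrmr pipeline should produce the same "lol" column once it runs on the HDFS copy of the data.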
Question 1: Is this the right way to do it? That is, do I have to call magic.wand first on the dplyr functions and then on the block of code?
Question 2: Why can't I call the magic.wand function like this?
magic.wand(dplyr::rename,TRUE)
Isn't the namespace prefix necessary to make sure we call the intended function when several loaded packages export one with the same name?
Question 3: Why do I have to pass TRUE as the second argument in the first magic.wand call but not in the last one?
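To make the concern behind Question 2 concrete, here is a small base-R sketch that needs neither dplyr nor plyrmr. The local `filter` function is hypothetical and only stands in for a name clash. The last two lines show that a bare name and a namespace-qualified name are different kinds of expression; whether magic.wand looks at its unevaluated argument is an assumption on my part, not something I have checked in the plyrmr source.

```r
# Hypothetical local function that masks stats::filter, standing in for a
# name clash such as two loaded packages both exporting rename().
filter <- function(x) "local filter"

unqualified <- filter(1:3)                      # picks the local definition
qualified <- stats::filter(1:5, rep(1 / 3, 3))  # always the stats version

print(unqualified)
print(class(qualified))

# The two spellings are also different kinds of expression, which matters
# only if magic.wand inspects its unevaluated argument (an assumption).
print(is.name(quote(rename)))         # a bare symbol
print(is.call(quote(dplyr::rename)))  # a call to `::`
```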
Question 4: What if my block of code uses the dplyr piping operator? For example:
complex.function = function(x){
  x$carb <- x[, ncol(x)] * 2
  x$carb <- x$carb + 2
  x <- as.data.frame(x %>% rename(lol = carb))
  return(x)
}
Should I just replace "%>%" with the plyrmr piping operator, "%|%"?
Question 5: Should I call magic.wand on dplyr functions that have plyrmr equivalents, such as "group_by"?
Question 6: Why do I get an error when using as.POSIXct?
magic.wand(mutate,TRUE)
filename <- "/user/sgerony/mtcars"
complex.function = function(x){
  x$carb <- x[, ncol(x)] * 2
  x$carb <- x$carb + 2
  x <- as.data.frame(mutate(x, date.time = as.POSIXct("2014-01-01 03:15")))
  return(x)
}
magic.wand(complex.function)
# works
mtcars %|% complex.function()
# does NOT work
input(filename) %|% complex.function()
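For context on Question 6, this base-R sketch shows what the added column looks like: a POSIXct value is a number of seconds carrying class and time zone attributes. Whether that is related to the failure on HDFS input is only a guess. The `tz = "UTC"` argument is an assumption added here so the printed value does not depend on the local time zone; my original call does not set it.

```r
# tz = "UTC" is an assumption added for reproducibility; the original call
# relies on the session's default time zone.
date.time <- as.POSIXct("2014-01-01 03:15", tz = "UTC")

print(class(date.time))     # "POSIXct" "POSIXt"
print(unclass(date.time))   # seconds since 1970-01-01 UTC, plus a tzone attribute

# The same value as a data frame column, similar to what mutate() adds
df <- data.frame(id = 1, date.time = date.time)
str(df)
```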
I realize this is a big question, so thanks for trying to help.