What is the best way to nest a function operation on a data frame in another function? I want to write a function which takes a data frame and a column name, does something to that column, and returns the modified data frame, like below:
library(dplyr)
func = function(df, col){
df = df %>% mutate(col = col + 1)
return(df)
}
new_df = func(cars, 'speed')
But this raises an error because inside the function col is a string rather than a column, and I am not sure how to pass the column as a function argument in any way other than a string. Any idea how to fix this with minimum effort?
CodePudding user response:
To use dplyr code inside a function you have to work with its non-standard evaluation. In this case, wrapping the argument in {{ }} inside the function would do.
library(dplyr)
func = function(df, col) {
df = df %>% mutate({{col}} := {{col}} + 1)
return(df)
}
new_df = func(cars, speed)
head(cars)
# speed dist
#1 4 2
#2 4 10
#3 7 4
#4 7 22
#5 8 16
#6 9 10
head(new_df)
# speed dist
#1 5 2
#2 5 10
#3 8 4
#4 8 22
#5 9 16
#6 10 10
You can read more about non-standard evaluation here https://dplyr.tidyverse.org/articles/programming.html
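If you would rather keep passing the column name as a string, as in the original call func(cars, 'speed'), one option is across() together with all_of(). The following is only a minimal sketch: add_one and its cols argument are hypothetical names, and it assumes a dplyr version that provides across() (1.0.0 or later).

```r
library(dplyr)

# Hypothetical helper: `cols` is a character vector of column names.
add_one <- function(df, cols) {
  # all_of() selects exactly the columns named in `cols` (and errors if one
  # is missing); the lambda then overwrites each selected column with x + 1.
  df %>% mutate(across(all_of(cols), ~ .x + 1))
}

new_df <- add_one(cars, "speed")
head(new_df)  # speed is 5, 5, 8, 8, 9, 10 in the first six rows; dist is unchanged
```

Because cols is a character vector, the same helper also accepts several columns at once, e.g. add_one(cars, c("speed", "dist")).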
CodePudding user response:
I think you mean that you want col to be numeric, so that you can add 1 to it? If that is correct, see below.
library(dplyr)
func = function(df, col){
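# .data[[col]] looks the column up by its string name; !!col := keeps that name on the left-hand side
df = df %>% mutate(!!col := as.numeric(.data[[col]]) + 1)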
return(df)
}
new_df = func(cars, 'speed')
Another alternative would be to use the index of the column as the function argument, instead of the string name of the column.
That might look something like
library(dplyr)
func = function(df, col_index){
col_name <- colnames(df)[col_index]
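# col_name holds a string, so look the column up with .data[[ ]] and keep its name with !! and :=
df = df %>% mutate(!!col_name := .data[[col_name]] + 1)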
return(df)
}
new_df = func(cars, 2)
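If minimum effort is the goal, base R indexing with [[ ]] may be enough, since it accepts either a column name or a column index and needs no tidy evaluation. A minimal sketch, where add_one_base is a hypothetical name and the column is assumed to be numeric:

```r
# Hypothetical helper: `col` may be a column name (string) or a column index.
add_one_base <- function(df, col) {
  # [[ ]] works with both a name and a position, so the same line covers both cases.
  df[[col]] <- df[[col]] + 1
  df
}

by_name  <- add_one_base(cars, "speed")  # modifies the speed column
by_index <- add_one_base(cars, 2)        # modifies the second column, dist
```

The function modifies a copy, so cars itself is left untouched, just as with the mutate() versions.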