I am writing a package with several functions that accept a data frame as well as the data frame's column names as arguments.
Here is a simplified example:
func = function(df, vars) {
  head(df[, vars])
}
#column args as strings
func(mtcars,c("mpg","cyl"))
Instead of supplying the column names as strings, I would like the function to accept (and suggest/auto-complete) the column names like in dplyr functions.
#dplyr-style args
func(mtcars, mpg, cyl)
#which doesn't work because mpg and cyl don't exist as objects
I considered using ... as the function arguments, but this would still involve using strings.
Any help would be appreciated.
CodePudding user response:
You can use
subset(df, select = item)
You should check out Advanced R by Hadley Wickham which is extremely interesting, if somewhat, well, advanced. In particular:
20.4 Data masks
In this section, you’ll learn about the data mask, a data frame where the evaluated code will look first for variable definitions. The data mask is the key idea that powers base functions like with(), subset() and transform(), and is used throughout the tidyverse in packages like dplyr and ggplot2.
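To make the data mask idea concrete, here is a minimal base R sketch of the trick that subset() relies on for its select argument: the column names are mapped to their positions, and the captured expression is evaluated with that list as the mask. The name func_mask is hypothetical and only used for illustration; this is one way to do it, not necessarily how a package should.

```r
# func_mask is a hypothetical name, not part of any package.
func_mask <- function(df, ...) {
  # Data mask: each column name stands for its column position
  mask <- as.list(seq_along(df))
  names(mask) <- names(df)
  # substitute() expands ... into the unevaluated call, e.g. c(mpg, cyl);
  # eval() looks names up in the mask first, then in the caller's environment
  cols <- eval(substitute(c(...)), mask, parent.frame())
  head(df[, cols, drop = FALSE])
}

func_mask(mtcars, mpg, cyl)
```

Because the names evaluate to positions, a range such as func_mask(mtcars, mpg:disp) should also work, as it does with subset().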
CodePudding user response:
A possible solution, using dplyr:
library(dplyr)
func = function(df, ...) {
  df %>%
    select(...) %>%
    head()
}
func(mtcars, mpg, cyl)
#>                    mpg cyl
#> Mazda RX4         21.0   6
#> Mazda RX4 Wag     21.0   6
#> Datsun 710        22.8   4
#> Hornet 4 Drive    21.4   6
#> Hornet Sportabout 18.7   8
#> Valiant           18.1   6
func(mtcars, mpg)
#>                    mpg
#> Mazda RX4         21.0
#> Mazda RX4 Wag     21.0
#> Datsun 710        22.8
#> Hornet 4 Drive    21.4
#> Hornet Sportabout 18.7
#> Valiant           18.1
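If you would rather keep a single named argument (like vars in the question) instead of ..., a possible variant is to forward it to select() with the embrace operator {{ }}. This is only a sketch: it assumes dplyr 1.0 or later, and func_vars is a hypothetical name used for illustration.

```r
library(dplyr)

# func_vars is a hypothetical name; {{ vars }} passes the caller's unquoted
# expression on to select(), which resolves it against the columns of df
func_vars <- function(df, vars) {
  df %>%
    select({{ vars }}) %>%
    head()
}

# Several columns go inside c(), as in select() itself
func_vars(mtcars, c(mpg, cyl))
func_vars(mtcars, mpg)
```

The trade-off is that callers wrap multiple columns in c(), but the function is then free to take further named arguments after vars.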
Or in base R:
func = function(df, ...) {
  head(df[, sapply(substitute(...()), deparse)])
}
func(mtcars, mpg, cyl)
#>                    mpg cyl
#> Mazda RX4         21.0   6
#> Mazda RX4 Wag     21.0   6
#> Datsun 710        22.8   4
#> Hornet 4 Drive    21.4   6
#> Hornet Sportabout 18.7   8
#> Valiant           18.1   6
func(mtcars, mpg)
#> [1] 21.0 21.0 22.8 21.4 18.7 18.1
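Note that the single-column call returns a plain vector, because [ drops the data frame structure when only one column is selected. If you want a data frame back in every case (as with the dplyr version), a sketch along these lines should do it. The name func_df is hypothetical, and substitute(list(...)) is used here as a more conventional way to capture the arguments than substitute(...()).

```r
# func_df is a hypothetical name, not part of any package.
func_df <- function(df, ...) {
  # Capture the unevaluated arguments as a call list(mpg, cyl, ...),
  # drop the leading `list` symbol, and deparse each argument to a string
  args <- as.list(substitute(list(...)))[-1]
  cols <- vapply(args, deparse, character(1))
  # drop = FALSE keeps the result a data frame even for a single column
  head(df[, cols, drop = FALSE])
}

func_df(mtcars, mpg, cyl)
func_df(mtcars, mpg)  # a one-column data frame rather than a vector
```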