How to pass column names in function to carry out filter, select and sort actions within a function

Time:03-24

I am trying to carry out filter, select and arrange operations on a data frame from inside a function.

Below is the code I am trying to replicate with a function:

mtcars %>%  
  filter(disp > 150) %>%  
  select(disp, hp) %>%  
  arrange(hp)

The function I have created is below:

process_data <- function(df, col_1, col_2){
  df %>%  filter(col_1 > 150) %>%  
    select(col_1, col_2)
}

process_data(df = mpg, col_1 = "disp", col_2 = "hp")

However, when I try to execute it, I get the error below:

Error: Can't subset columns that don't exist.
x Column `disp` doesn't exist.

I have tried multiple ways of passing the column names, but it isn't working.

CodePudding user response:

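When the column names are passed as strings, we need to convert them to symbols with rlang::ensym() and unquote them with !! inside the dplyr verbs. Otherwise filter(col_1 > 150) compares the string "disp" itself with 150 instead of the disp column. Note also that the call in the question passes mpg, which has no disp or hp columns; mtcars is used below.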

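library(dplyr)
process_data <- function(df, col_1, col_2){
  # Turn each string (or bare name) into a symbol
  col_1 <- rlang::ensym(col_1)
  col_2 <- rlang::ensym(col_2)
  # !! unquotes the symbol so dplyr treats it as a column
  df %>%
    filter(!!col_1 > 150) %>%
    select(!!col_1, !!col_2)
}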

Testing:

process_data(df = mtcars, col_1 = "disp", col_2 = "hp")
                     disp  hp
Mazda RX4           160.0 110
Mazda RX4 Wag       160.0 110
Hornet 4 Drive      258.0 110
Hornet Sportabout   360.0 175
Valiant             225.0 105
Duster 360          360.0 245
Merc 280            167.6 123
Merc 280C           167.6 123
Merc 450SE          275.8 180
Merc 450SL          275.8 180
Merc 450SLC         275.8 180
Cadillac Fleetwood  472.0 205
Lincoln Continental 460.0 215
Chrysler Imperial   440.0 230
Dodge Challenger    318.0 150
AMC Javelin         304.0 150
Camaro Z28          350.0 245
Pontiac Firebird    400.0 175
Ford Pantera L      351.0 264
Maserati Bora       301.0 335
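
The pipeline in the question also sorts by hp, which process_data() above leaves out. A minimal sketch of how that step could be added with the same ensym()/!! approach; the name process_data_sorted is hypothetical and not part of the original answer:

```r
library(dplyr)

# Hypothetical variant of process_data() that also sorts by the second column.
process_data_sorted <- function(df, col_1, col_2) {
  # Convert the string inputs to symbols
  col_1 <- rlang::ensym(col_1)
  col_2 <- rlang::ensym(col_2)
  df %>%
    filter(!!col_1 > 150) %>%      # keep rows where col_1 exceeds 150
    select(!!col_1, !!col_2) %>%   # keep only the two columns
    arrange(!!col_2)               # sort ascending by col_2
}

result <- process_data_sorted(mtcars, "disp", "hp")

# The hard-coded pipeline from the question, for comparison
expected <- mtcars %>%
  filter(disp > 150) %>%
  select(disp, hp) %>%
  arrange(hp)

isTRUE(all.equal(result, expected))
```

Comparing against the hard-coded pipeline is a quick way to check that the function reproduces the original behaviour.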

CodePudding user response:

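Another solution, using the .data pronoun in filter() and any_of() in select(). With string input, filter(col_1 > 150) would compare the string itself with 150, so the column has to be looked up by name:

library(dplyr)
process_data <- function(df, col_1, col_2){
    df %>%
        # .data[[col_1]] looks up the column whose name is stored in col_1
        filter(.data[[col_1]] > 150) %>%
        # any_of() selects the named columns and silently skips missing ones
        select(any_of(c(col_1, col_2)))
}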
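
Both answers above take the column names as strings. If passing bare (unquoted) names is preferred, the embrace operator {{ }} may be a simpler option. The following is only a sketch, assuming dplyr 1.0.0 or later (rlang 0.4.0 or later); the name process_data_bare is hypothetical:

```r
library(dplyr)

# Hypothetical helper that takes bare column names instead of strings.
process_data_bare <- function(df, col_1, col_2) {
  df %>%
    filter({{ col_1 }} > 150) %>%          # {{ }} forwards the caller's column
    select({{ col_1 }}, {{ col_2 }}) %>%
    arrange({{ col_2 }})                   # sort step from the question
}

bare_result <- process_data_bare(mtcars, disp, hp)
head(bare_result)
```

With this form the call reads like ordinary dplyr code, but it no longer accepts strings in filter() and arrange(); for string input, the ensym() or .data approaches are the ones to use.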
Tags: r