(R) Use dplyr::filter when the condition is a string

Time:03-25

I'm having an issue with dplyr::filter().

I have a list of data frames, such as:

df <- data.frame(a = 1:3, b = 3:1, level = letters[1:3])
df2 <- data.frame(a = 1:6, b = 21:26, level = letters[1:3])
listofdf <- list(df,df2)

I'm trying to write a function that selects specific rows in each data frame. Because the column is passed in as a function argument, the column name is necessarily a string.

The function looks something like this. It is meant to be used on a list of data frames:

selectOTUlevel <- function(OTU,DATA,column){
  for (i in 1:length(DATA)){
    DATA <- DATA %>% filter(column == OTU)
  }
  return(DATA)
}

I've tried writing the same function another way:

selectOTU <- function(OTU,DATA,column){
  for (i in 1:length(DATA)){
    DATA[[i]] <- DATA[[i]][DATA[[i]]$column == OTU,]
  }
  return(DATA)
}

Neither version works. I've also tried this solution (https://www.r-bloggers.com/2020/09/using-dplyrfilter-when-the-condition-is-a-string/), but it doesn't work either.

If someone could explain what I'm doing wrong, I would be delighted!

CodePudding user response:

Take a look at this example. Inside your function you can use get() to look up the column by its name, given as a string:

library(dplyr)

df <- data.frame(a = 1:3, b = 3:1, level= letters[1:3])    

filter_data <- function(data, column_name, filter_value){
  data %>% filter(get(column_name) == filter_value)
}

filter_data(data = df,
            column_name = "level", 
            filter_value = "b")

Result:

  a b level
1 2 2     b
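
For the list of data frames in the question, the same idea can be applied to each element with lapply(). The sketch below is one possible variant rather than the only way to do it: it swaps get() for dplyr's .data pronoun, which looks only at the columns of the data frame, so a misspelled column name raises an error instead of picking up an unrelated variable of the same name from the surrounding environment. The function name filter_list is made up for illustration.

```r
library(dplyr)

df  <- data.frame(a = 1:3, b = 3:1, level = letters[1:3])
df2 <- data.frame(a = 1:6, b = 21:26, level = letters[1:3])
listofdf <- list(df, df2)

# filter_list is a hypothetical helper name, not part of dplyr.
filter_list <- function(data_list, column_name, filter_value) {
  lapply(data_list, function(d) {
    # .data[[column_name]] looks the column up by its string name
    d %>% filter(.data[[column_name]] == filter_value)
  })
}

# Keep the rows where level == "b" in every data frame of the list
result <- filter_list(listofdf, "level", "b")
```

This also points at why the second attempt in the question fails: DATA[[i]]$column looks for a column literally named "column", whereas DATA[[i]][[column]] would use the string stored in the argument.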

CodePudding user response:

Your argument doesn't need to be a string to select a column with dplyr functions. Capture it with rlang::enquo(), then unquote it with !! where you use it. That lets you pass an unquoted column name, the same way you would pass it directly to filter():

(I modified the function slightly.)

library(tidyverse)

df <- data.frame(a = 1:3, b = 3:1, level = letters[1:3])
df2 <- data.frame(a = 1:6, b = 21:26, level = letters[1:3])
listofdf <- list(df,df2)

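selectOTUlevel <- function(OTU, DATA, column) {
  # Capture the unquoted column name as a quosure
  column <- rlang::enquo(column)
  D_OUT <- list()
  # seq_along() is safe even when DATA is an empty list
  for (i in seq_along(DATA)) {
    # !! unquotes the captured column inside filter()
    D_OUT[[i]] <- DATA[[i]] %>% filter(!!column == OTU)
  }
  return(D_OUT)
}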


selectOTUlevel("a", listofdf, level)

#> [[1]]
#>   a b level
#> 1 1 3     a
#> 
#> [[2]]
#>   a  b level
#> 1 1 21     a
#> 2 4 24     a

Created on 2022-03-25 by the reprex package (v2.0.1)
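
As a side note, the enquo() plus !! pair can usually be written with the embrace operator {{ }}, assuming a reasonably recent rlang (0.4.0 or later) and a matching dplyr. The sketch below is a variant of the function above under that assumption; the name select_level is made up for illustration.

```r
library(dplyr)

df  <- data.frame(a = 1:3, b = 3:1, level = letters[1:3])
df2 <- data.frame(a = 1:6, b = 21:26, level = letters[1:3])
listofdf <- list(df, df2)

# select_level is a hypothetical name used only for this sketch.
select_level <- function(OTU, DATA, column) {
  out <- vector("list", length(DATA))
  for (i in seq_along(DATA)) {
    # {{ column }} captures and unquotes the argument in one step
    out[[i]] <- DATA[[i]] %>% filter({{ column }} == OTU)
  }
  out
}

# The column is passed unquoted, as in the answer above
out <- select_level("a", listofdf, level)
```

The call is the same as for selectOTUlevel(), so the two versions should be interchangeable for this use.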
