I'm having an issue with dplyr::filter.
I have a list of data frames, such as:
df<- data.frame(a =1:3, b = 3:1, level= letters[1:3])
df2 <- data.frame(a =1:6, b = 21:26, level= letters[1:3])
listofdf <- list(df,df2)
I'm trying to write a function that selects specific rows from my data frames, which means the column name has to be passed in as a string.
The function looks something like this; it is meant to be used on a list of data frames:
selectOTUlevel <- function(OTU,DATA,column){
for (i in 1:length(DATA)){
DATA <- DATA %>% filter(column == OTU)
}
return(DATA)
}
I've also tried writing the same function another way:
selectOTU <- function(OTU,DATA,column){
for (i in 1:length(DATA)){
DATA[[i]] <- DATA[[i]][DATA[[i]]$column == OTU,]
}
return(DATA)
}
But I can't seem to solve this issue. I've also tried this solution (https://www.r-bloggers.com/2020/09/using-dplyrfilter-when-the-condition-is-a-string/), but it doesn't work either.
If someone could explain what I'm doing wrong, I would be delighted!
CodePudding user response:
Take a look at this example. You can use get() inside your function to look up the column from its name given as a string:
library(dplyr)
df <- data.frame(a = 1:3, b = 3:1, level= letters[1:3])
filter_data <- function(data, column_name, filter_value){
data %>% filter(get(column_name) == filter_value)
}
filter_data(data = df,
column_name = "level",
filter_value = "b")
Result:
a b level
1 2 2 b
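As a possible alternative to get(), here is a minimal sketch that uses the .data pronoun and applies the filter to every data frame in a list, as the question asks. The helper name filter_list_by_string and its argument names are hypothetical, chosen only for this illustration. One reason to consider .data: get() can fall back to a variable of the same name in the calling environment if the column is missing, whereas .data[[...]] only looks inside the data frame and raises an error otherwise.

```r
library(dplyr)

# Hypothetical helper (not from the original post): filter each data frame
# in a list, with the column name supplied as a string.
filter_list_by_string <- function(data_list, column_name, filter_value) {
  lapply(data_list, function(d) {
    # .data[[column_name]] looks the column up by name inside the data frame only
    filter(d, .data[[column_name]] == filter_value)
  })
}

df  <- data.frame(a = 1:3, b = 3:1, level = letters[1:3])
df2 <- data.frame(a = 1:6, b = 21:26, level = letters[1:3])

result <- filter_list_by_string(list(df, df2), "level", "a")
```

The two attempts in the question fail for a related reason: filter(column == OTU) compares the string stored in column with OTU instead of looking up a column, and DATA[[i]]$column looks for a column literally named "column". In base R, DATA[[i]][[column]] would do the lookup by string.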
CodePudding user response:
Your argument doesn't necessarily need to be a string to select a column with dplyr functions. Capturing the argument with rlang::enquo() and then unquoting it with !! where the variable is used lets you pass an unquoted column name, the same way you would pass it directly to filter (I modified the function slightly):
library(tidyverse)
df<- data.frame(a =1:3, b = 3:1, level= letters[1:3])
df2 <- data.frame(a =1:6, b = 21:26, level= letters[1:3])
listofdf <- list(df,df2)
selectOTUlevel <- function(OTU,DATA,column){
column <- rlang::enquo(column)
D_OUT <- list()
for (i in 1:length(DATA)){
D_OUT[[i]] <- DATA[[i]] %>% filter(!!column == OTU)
}
return(D_OUT)
}
selectOTUlevel("a", listofdf, level)
#> [[1]]
#> a b level
#> 1 1 3 a
#>
#> [[2]]
#> a b level
#> 1 1 21 a
#> 2 4 24 a
Created on 2022-03-25 by the reprex package (v2.0.1)
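The same idea can also be sketched with the embrace operator {{ }}, which combines the enquo() and !! steps into one. This assumes a version of rlang that supports {{ }} (0.4.0 or later); the function and argument names below are hypothetical variations on the ones above. The loop uses seq_along() rather than 1:length(), so an empty list is handled without iterating over 1:0.

```r
library(dplyr)

# Hypothetical variant of selectOTUlevel using the embrace operator.
select_otu_level <- function(otu, data_list, column) {
  out <- vector("list", length(data_list))
  for (i in seq_along(data_list)) {
    # {{ column }} forwards the unquoted column name to filter()
    out[[i]] <- filter(data_list[[i]], {{ column }} == otu)
  }
  out
}

df  <- data.frame(a = 1:3, b = 3:1, level = letters[1:3])
df2 <- data.frame(a = 1:6, b = 21:26, level = letters[1:3])

res <- select_otu_level("a", list(df, df2), level)
```

Called this way, res should hold the same rows as the reprex output above: one row from the first data frame and two from the second.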