I have two dataframes and a function, which works when I use it on a single variable.
library(tidyverse)
iris1<-iris
iris2<-iris
iris_fn<-function(df,species_type){
df1<-df%>%
filter((Species==species_type))
return(df1)}
new_df<-iris_fn(df=iris1, species_type="setosa")
I want to pass a vector of values to the function and get back a list of three dataframes, one filtered to each value. I have been experimenting with lapply:
variables<-c("setosa","versicolor","virginica")
new_df<-lapply(df=iris1, species_type="setosa", FUN= iris_fn)
The error message is: Error in is.vector(X) : argument "X" is missing, with no default
I don't understand this, because I have named the function's arguments and specified the function itself.
Can anyone suggest a way to get the desired output? I essentially need a version of lapply, or a purrr function, that accepts a dataframe and a vector as inputs.
CodePudding user response:
lapply expects its main input, the vector or list to iterate over, in an argument called X. You could rewrite the function so that it expects X instead of species_type, and pass the vector as X, e.g.
iris_fn <- function(df, X){
df1 <- df %>% filter((Species==X))
return(df1)
}
variables <- c("setosa", "versicolor", "virginica")
new_df <- lapply(X=variables, FUN=iris_fn, df=iris1)
EDIT: Alternatively, to avoid renaming the argument to X, make the value being iterated over the first argument of the function, so that lapply passes each element to it, e.g.
iris_fn <- function(species_type, df){
df1 <- df %>% filter((Species==species_type))
return(df1)
}
new_df <- lapply(variables, FUN=iris_fn, df=iris1)
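A side note on argument matching, offered as a sketch rather than as part of the original answer: lapply passes each element of X to FUN positionally, and R matches named arguments before positional ones. Under that rule, the question's original function (df first, species_type second) should also work unchanged, as long as df is supplied by name. The example assumes dplyr is installed.

```r
library(dplyr)

iris1 <- iris

# Original argument order from the question: df first, species_type second.
iris_fn <- function(df, species_type) {
  df %>% filter(Species == species_type)
}

variables <- c("setosa", "versicolor", "virginica")

# df is matched by name, so each element of `variables` fills the
# remaining formal argument, species_type.
new_df <- lapply(variables, iris_fn, df = iris1)

length(new_df)        # number of dataframes in the list
sapply(new_df, nrow)  # rows in each filtered dataframe
```

Reordering the arguments, as in the example above, is still a reasonable choice if you prefer not to rely on named matching.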
Check out the split function for a convenient way to split a data.frame into a list, e.g. split(iris, f=iris$Species)
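As a minimal sketch of the split approach (base R only, variable names are illustrative): the result is a list with one dataframe per level of Species, and the list elements are named after those levels, so each subset can be looked up by name.

```r
# Split iris into one dataframe per species (base R, no packages needed).
by_species <- split(iris, f = iris$Species)

names(by_species)        # element names are taken from the levels of Species
sapply(by_species, nrow) # rows in each subset

# Access a single subset by name.
setosa_df <- by_species[["setosa"]]
```

Note that split always produces a subset for every level of the factor, whereas the lapply approach only returns the values you list in the vector.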
CodePudding user response:
From ?lapply: lapply(X, FUN, ...). Because you named all of your arguments, there is no X whose elements could be passed to the function as its first argument.
Try something like this:
library(dplyr)
iris1<-iris
# note the changed argument order
iris_fn<-function(species_type, df){
df1<-df%>%
filter((Species==species_type))
return(df1)}
variables<-c("setosa","versicolor","virginica")
new_df_list <-lapply(variables, iris_fn, df=iris1 )
Or with just an anonymous function:
new_df_list <-lapply(variables, \(x) filter(iris1, Species == x))
As you already use the tidyverse, perhaps use purrr::map() instead:
library(purrr)
new_df_list <- map(variables, ~ filter(iris1, Species == .x))
Created on 2022-11-14 with reprex v2.0.2
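One possible refinement, sketched here as an addition rather than as part of the original answer: the lists produced above are unnamed, so elements can only be accessed by position. Naming the input vector with purrr::set_names() before mapping should carry the names through to the result, since map() preserves the names of its input. The example assumes purrr and dplyr are installed.

```r
library(dplyr)
library(purrr)

iris1 <- iris
variables <- c("setosa", "versicolor", "virginica")

# set_names(variables) names each element after its own value;
# map() keeps those names on the resulting list.
named_df_list <- variables %>%
  set_names() %>%
  map(~ filter(iris1, Species == .x))

names(named_df_list)

# Elements can now be accessed by species name instead of position.
virginica_df <- named_df_list[["virginica"]]
```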