I was wondering how to apply a function to a data frame. For example, if I want a function that returns the mean of a certain column (say, column 13 of every data frame I pass to the function), how can I write that instruction and pass the data frame as the argument?
mean <- function (argument) {}
mean <- function(data frame) {} #this is what I try to do
The thing is that I have to get the average of 13 tables, but I only use one column of each table, and I have to gather the results in a single data frame. I think this would be easier with a function that takes each table (data frame) and returns that single result.
User response:
Treat the input of the function exactly as you would any other data frame. Below is a general layout of what you are looking for:
df_mean <- function(df) {
  # Return the mean of the chosen column of the data frame passed in
  mean(df$desiredColumnName)
}
lapply(dataframe_list, df_mean)
This assumes that you are looking for the same column name in each data frame, and that your data frames are in a list.
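Since the question refers to the column by position (column 13) rather than by name, the same idea can be sketched with `[[` indexing. The helper name `col_mean`, its `col` argument, and the toy data are assumptions made for illustration. Using `na.rm = TRUE` is also an optional choice that may or may not suit your data.

```r
# Hypothetical helper: mean of one column, selected by position or by name.
col_mean <- function(df, col = 13) {
  mean(df[[col]], na.rm = TRUE)  # na.rm = TRUE skips missing values (an assumption)
}

# Toy data: two data frames with 13 numeric columns each (illustrative only).
df_a <- as.data.frame(matrix(1:26, nrow = 2))
df_b <- as.data.frame(matrix(27:52, nrow = 2))
dataframe_list <- list(a = df_a, b = df_b)

# One mean per data frame, returned as a list with the same names.
col_means <- lapply(dataframe_list, col_mean)
```

Passing a column name instead, for example `col_mean(df_a, "V13")`, works the same way because `[[` accepts either a position or a name.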
Alternatively, since this is quite simple, an anonymous function works as well:
lapply(dataframe_list, function(x) mean(x$desiredColumnName))
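The question also asks for the results to be gathered in a single data frame. One possible sketch, assuming `dataframe_list` is a named list, uses `vapply` to get a numeric vector instead of a list and then builds the data frame from it. The toy tables and the output column names `table` and `mean` are made up for illustration.

```r
# Toy named list standing in for the 13 tables (illustrative data only).
dataframe_list <- list(
  table1 = data.frame(desiredColumnName = c(1, 2, 3)),
  table2 = data.frame(desiredColumnName = c(10, 20))
)

# vapply returns a named numeric vector, one mean per data frame.
means <- vapply(dataframe_list, function(x) mean(x$desiredColumnName), numeric(1))

# Collect the names and means into one data frame.
result <- data.frame(
  table = names(means),
  mean = unname(means),
  stringsAsFactors = FALSE
)
```

`vapply` is used here rather than `sapply` because it checks that every call returns exactly one number, which may catch a missing or non-numeric column early.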
If your data are stored as CSV files in one directory, you can read them into a list like this:
dataframe_list <- lapply(list.files(directory_path, recursive = TRUE, full.names = TRUE), read.csv)
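If the directory may contain other file types, or if you want to know which result belongs to which file, a slightly extended sketch could filter on the `.csv` extension and name the list elements after the files. The temporary directory, the file names, and the column `x` below are assumptions used only to make the example runnable.

```r
# Write two small CSV files to a temporary directory (illustrative data only).
directory_path <- file.path(tempdir(), "csv_demo")
dir.create(directory_path, showWarnings = FALSE)
write.csv(data.frame(x = 1:3), file.path(directory_path, "first.csv"), row.names = FALSE)
write.csv(data.frame(x = 4:6), file.path(directory_path, "second.csv"), row.names = FALSE)

# Restrict the listing to files ending in .csv and keep full paths for reading.
csv_files <- list.files(directory_path, pattern = "\\.csv$", full.names = TRUE)

dataframe_list <- lapply(csv_files, read.csv)

# Name each element after its file, without the extension.
names(dataframe_list) <- tools::file_path_sans_ext(basename(csv_files))
```

With named elements, the output of `lapply` or `vapply` over `dataframe_list` carries the file names along, which makes the combined results easier to read.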