I apologize if there is already a solution available in another question. I have 500 CSV files, all with sequential names ("name1", "name2", etc.). I need to run the same simple linear regression on each file and save the coefficient outputs. The column names are the same in every file, and there is only a single x variable and a single y variable.
I only know how to use
lm(tablename$columnY~tablename$columnX)
on individual files. I'm not sure how to set up a loop to run through each file.
Any help is appreciated
CodePudding user response:
Here is how I would run it using a function (you can also use a for loop). The principle is to list all the files in question and go through them one by one, saving the results of each model as we go.
#put all your files in one folder and point to the folder path
path <- "C:/Users/xxx/Desktop"
#list all the files, with directory attached
lst <- list.files(path, pattern = "\\.csv$", full.names = TRUE) #the pattern keeps only .csv files, in case the folder holds anything else
#make a function or loop (I like functions to get structured output)
fun <- function(i){
  #read each csv one at a time
  dat <- read.csv(lst[i])
  #make the model; passing data = dat keeps the coefficient named "columnX" (with dat$columnX in the formula it would be named "dat$columnX" and the lookup below would fail)
  mod <- lm(columnY ~ columnX, data = dat)
  #extract the information from the model (press View on any model, choose the desired values and just copy that code)
  intcpt <- mod[["coefficients"]][["(Intercept)"]]
  slope <- mod[["coefficients"]][["columnX"]]
  #put into a data frame, with the name of the file
  data.frame(file = lst[i], intercept = intcpt, slope = slope)
}
temp <- lapply(seq_along(lst), fun) #run the model (lapply takes the last value evaluated in the function and makes a list element for each "loop")
results <- do.call("rbind", temp) #from list to data frame
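Since the question asks about a loop and about saving the coefficients, here is a minimal sketch of the same idea written as a for loop, with the results written to a CSV at the end. It builds three small demo files in a temporary folder so it can run on its own. The folder name `lm_demo`, the simulated data, and the output file name `lm_coefficients.csv` are all assumptions for illustration; `columnX` and `columnY` stand in for your real column names.

```r
# Hypothetical setup: write three small CSVs to a temporary folder so the
# sketch runs on its own; in practice, point `path` at your own folder.
path <- file.path(tempdir(), "lm_demo")
dir.create(path, showWarnings = FALSE)
set.seed(1)
for (k in 1:3) {
  x <- 1:20
  demo <- data.frame(columnX = x, columnY = 2 + 3 * x + rnorm(20))
  write.csv(demo, file.path(path, paste0("name", k, ".csv")), row.names = FALSE)
}

# List only the demo CSV files, with the directory attached
lst <- list.files(path, pattern = "^name[0-9]+\\.csv$", full.names = TRUE)

# Pre-allocate a list, then fill one element per file
temp <- vector("list", length(lst))
for (i in seq_along(lst)) {
  dat <- read.csv(lst[i])
  mod <- lm(columnY ~ columnX, data = dat)
  cf <- coef(mod)  # named vector: "(Intercept)" and "columnX"
  temp[[i]] <- data.frame(file = basename(lst[i]),
                          intercept = cf[["(Intercept)"]],
                          slope = cf[["columnX"]])
}

# Stack the one-row data frames into a single table
results <- do.call(rbind, temp)

# Save the coefficient table; the output location is an assumption
out_file <- file.path(tempdir(), "lm_coefficients.csv")
write.csv(results, out_file, row.names = FALSE)
```

With your own data, drop the setup block, set `path` to your folder, and change `out_file` to wherever you want the table saved. Writing the output outside the input folder avoids picking it up as an input file on a later run.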