I have a question about reading files into R. I have a folder with many files that I would like to combine into a single data frame in R to work with. What is the most efficient way to read in a large amount of data (over 1000 files)? My code is below; it has been running for a day now and still has not read in all the files.
data = data.frame()
for (file in files) {
  path = paste0("Data/", file, ".RData")
  if (file.exists(path)) {
    load(path) # as file_data
    data = dplyr::bind_rows(data, file_data)
  }
}
CodePudding user response:
You can list all the files, load them into a list, and bind them in a single step at the end. Growing a data frame with bind_rows inside the loop is what makes your version so slow, since the accumulated data frame is copied on every iteration.
library(dplyr)

my_files <- list.files(path = "Data/", pattern = "\\.RData$", full.names = TRUE)
# load() into the global environment and keep the names of the restored objects
all_data <- lapply(my_files, load, .GlobalEnv)
bind_rows(mget(unlist(all_data)))
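For reference, load() invisibly returns a character vector with the names of the objects it restored, which is what mget() then looks up. A quick check on a single file (the object name shown is only an example, taken from the question's comment):
x <- load(my_files[1], envir = .GlobalEnv)
x
# e.g. [1] "file_data"   (whatever name the object was saved under)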
A better solution than using mget is to create a new environment with new.env(), load the files into it, and then convert that environment to a list:
my_files <- list.files(path = "Data/", pattern = "\\.RData$", full.names = TRUE)
temp <- new.env()
# load every file into the temporary environment
lapply(my_files, load, temp)
# convert the environment into a named list of the loaded objects
all_data <- as.list(temp)
rm(temp)
bind_rows(all_data)
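One caveat with both variants: if every file stores its object under the same name (the question's load(path) comment suggests each file contains file_data), each load() overwrites the previous object and only the last file survives. A minimal sketch of a per-file helper that avoids this (the name read_one is just for illustration, and it assumes each .RData file holds a single data frame):
library(dplyr)

read_one <- function(path) {
  e <- new.env()
  load(path, envir = e)   # restore the file's contents into a throwaway environment
  e[[ls(e)[1]]]           # return the single object stored in the file
}

all_data <- lapply(my_files, read_one)
bind_rows(all_data)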
CodePudding user response:
You can use the fs package to make a list of the files and then read them with map_dfr. I use this for csv and excel files; for .RData files, load() only returns the names of the objects it restores rather than the data itself, so it needs a small wrapper that pulls the data frame out of each file.
library(fs)
library(tidyverse)
file_list <- dir_ls("Data", glob = "*.RData")
data <- file_list %>%
  map_dfr(function(path) {
    e <- new.env()
    load(path, envir = e)   # restore into a throwaway environment
    e[[ls(e)[1]]]           # return the data frame stored in the file
  })