reading from a list of files located in a folder in R

Time:12-03

I have to download a lot of data in bulk from the internet, and I don't want it to clutter my main directory, so I move it to a /data folder. I collect the file names into a list, then move all of those files into that folder. However, I then struggle to run analyses with sapply() and other functions over the files once they are in the folder. I can't find an argument in sapply() that takes a path, so I was wondering how to get around this. Below is some code demonstrating the problem.

library(dplyr)
library(fs)

mtcars %>% write.csv("data_1.csv")
DNase %>% write.csv("data_2.csv")
iris %>% write.csv("data_3.csv")

my_list <- list.files(pattern = "data_")

fs::file_move(my_list, new_path = "MYDIRECTORY/data")

sapply(my_list, read.csv)

Error in file(file, "rt") : cannot open the connection
In addition: Warning message:
  In file(file, "rt") :
  cannot open file 'data_1.csv': No such file or directory

CodePudding user response:

You could simplify this by not using data.table::fread, but it's fast, so if you have a lot of files in the folder it's worth keeping.

library(data.table)
library(dplyr)
library(purrr)


path_to_folder = "MYDIRECTORY/data"

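# list.files() takes a regular expression; full.names = TRUE keeps the folder in each path
df <- list.files(path = path_to_folder, pattern = "\\.csv$", full.names = TRUE) %>%
  # map_df() row-binds the results, so columns with the same name need compatible types
  map_df(~ fread(., stringsAsFactors = FALSE, check.names = TRUE, strip.white = TRUE))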
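
If the files do not share the same columns, as with the three example data sets in the question, one possible variation is to read them into a named list and combine them with data.table::rbindlist(), which can fill missing columns and record the source file. This is only a sketch under assumptions: the folder under tempdir() and the names data_dir, csv_paths, tables, combined and source_file are made up for illustration.

```r
library(data.table)

# Hypothetical folder under tempdir() so the sketch does not touch real data
data_dir <- file.path(tempdir(), "fread_example")
dir.create(data_dir, showWarnings = FALSE)

# Recreate the three example files from the question inside that folder
write.csv(mtcars, file.path(data_dir, "data_1.csv"))
write.csv(DNase, file.path(data_dir, "data_2.csv"))
write.csv(iris, file.path(data_dir, "data_3.csv"))

# Full paths, so fread() can find the files regardless of the working directory
csv_paths <- list.files(data_dir, pattern = "\\.csv$", full.names = TRUE)

# Read each file and name the list elements after the files
tables <- lapply(csv_paths, fread)
names(tables) <- basename(csv_paths)

# fill = TRUE pads columns missing from a file with NA;
# idcol adds a column holding the list name (here, the file name)
combined <- rbindlist(tables, fill = TRUE, idcol = "source_file")
```

The source_file column makes it possible to split the combined table back into per-file pieces later.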

CodePudding user response:

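With sapply(), you can prepend the folder to each file name:

# my_list holds bare file names, so prepend the folder the files were moved to
all_data <- sapply(my_list, function(x) {
  read.csv(file = paste0("./data/", x))
}, simplify = FALSE)  # always return a list of data frames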

And then, if you want to combine all the .csv files into a single data table, you can do this:

library(data.table)
rbindlist(all_data, fill = TRUE)
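
The error in the question comes from my_list still holding bare file names after the files were moved. A base R alternative, sketched here under assumptions, is to list the files again from the folder with full.names = TRUE so each entry already contains the path. The folder under tempdir() and the names data_dir, csv_paths and all_data are hypothetical and used only for illustration.

```r
# Hypothetical folder under tempdir() so the sketch does not touch real data
data_dir <- file.path(tempdir(), "data")
dir.create(data_dir, showWarnings = FALSE)

# Recreate the three example files from the question inside that folder
write.csv(mtcars, file.path(data_dir, "data_1.csv"))
write.csv(DNase, file.path(data_dir, "data_2.csv"))
write.csv(iris, file.path(data_dir, "data_3.csv"))

# full.names = TRUE returns paths that include the folder
csv_paths <- list.files(data_dir, pattern = "^data_.*\\.csv$", full.names = TRUE)

# lapply() always returns a list, one data frame per file
all_data <- lapply(csv_paths, read.csv)
names(all_data) <- basename(csv_paths)

# Later analyses can then run over the list, e.g. counting rows per file
row_counts <- sapply(all_data, nrow)
```

Because the paths are built from the folder itself, this works no matter which directory is the current working directory.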