I have to download a lot of data in bulk from the internet, and I don't want it to clutter my main directory, so I move it to a /data folder. I build a list of the files, then move that entire list into the folder. However, I then struggle to run analyses with sapply() and other functions on this list of files once they are located in the folder. I can't find any argument in sapply() that takes a path, so I was wondering how to get around this. Below is some code demonstrating the problem.
library(dplyr)
library(fs)
mtcars %>% write.csv("data_1.csv")
DNase %>% write.csv("data_2.csv")
iris %>% write.csv("data_3.csv")
my_list <- list.files(pattern = "data_")
fs::file_move(my_list, new_path = "MYDIRECTORY/data")
sapply(my_list, read.csv)
Error in file(file, "rt") : cannot open the connection
In addition: Warning message:
In file(file, "rt") :
cannot open file 'data_1.csv': No such file or directory
CodePudding user response:
You could simplify by not using data.table::fread, but it's fast, and if you have a lot of files in the folder it's worth keeping.
library(data.table)
library(dplyr)
library(purrr)

path_to_folder <- "MYDIRECTORY/data"

# full.names = TRUE returns complete paths, so fread() can open the
# files regardless of the working directory; pattern is a regular
# expression, so "\\.csv$" matches names ending in .csv
df <- list.files(path = path_to_folder, pattern = "\\.csv$", full.names = TRUE) %>%
  map_df(~ fread(.x, stringsAsFactors = FALSE, check.names = TRUE, strip.white = TRUE))
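For reference, here is a self-contained sketch of the same approach that you can run as-is; it uses a temporary directory in place of MYDIRECTORY/data (which only exists on the asker's machine) and seeds it with two of the built-in data sets from the question:

library(data.table)
library(dplyr)
library(purrr)

# stand-in for MYDIRECTORY/data
path_to_folder <- file.path(tempdir(), "data")
dir.create(path_to_folder, showWarnings = FALSE)
write.csv(mtcars, file.path(path_to_folder, "data_1.csv"), row.names = FALSE)
write.csv(iris,   file.path(path_to_folder, "data_3.csv"), row.names = FALSE)

# full.names = TRUE is the key: each element is a complete path
files <- list.files(path = path_to_folder, pattern = "\\.csv$", full.names = TRUE)
df <- map_df(files, ~ fread(.x))
nrow(df)  # 182 (32 rows from mtcars + 150 from iris)

Note that map_df() row-binds the tables with dplyr::bind_rows(), filling columns that exist in only one file with NA.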
CodePudding user response:
When you use the sapply function, you can prepend the folder path to each file name:
all_data <- sapply(my_list, function(x) {
  read.csv(file = paste0("./data/", x))
}, simplify = FALSE)  # simplify = FALSE guarantees a list of data frames
And then, if you want to append all the .csv files into a single data table, you can do this:
library(data.table)
rbindlist(all_data, fill = TRUE)
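A runnable version of the two steps above, sketched with a temporary folder standing in for the asker's ./data directory (the simplify = FALSE argument ensures sapply() returns the plain list that rbindlist() expects):

library(data.table)

# stand-in for the ./data folder from the answer
data_dir <- file.path(tempdir(), "data_sapply")
dir.create(data_dir, showWarnings = FALSE)
write.csv(mtcars, file.path(data_dir, "data_1.csv"), row.names = FALSE)
write.csv(iris,   file.path(data_dir, "data_3.csv"), row.names = FALSE)

my_list <- list.files(path = data_dir, pattern = "data_")

# read each file by joining the folder path to the bare file name
all_data <- sapply(my_list, function(x) {
  read.csv(file = paste0(data_dir, "/", x))
}, simplify = FALSE)

combined <- rbindlist(all_data, fill = TRUE)
nrow(combined)  # 182 (32 rows from mtcars + 150 from iris)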