I have been working on a dataset of folders and subfolders (folder -> subfolder -> file), and I have trouble reading the first 10 folders of data. I used the code below, but it doesn't work. Please help.
for (i in seq_along(my_folders)) {
  my_data[[i]] = list.files(path = "~/dataset1", recursive = TRUE)
}
Below is the problem I get when reading a .txt file in a subfolder:
> for(i in 1:13){
current_dir = dirs[i]
lines = readLines(mydata[[i]])}
This gives error: Error in file(con, "r") : invalid 'description' argument
But outside of the loop this works:
> lines <- readLines(my_data[[1]])
CodePudding user response:
What do you think of this?
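dirs = list.dirs("~/dataset1", recursive = FALSE) # top-level folders, returned as full paths
mydata = list() # create empty list
for (i in seq_len(min(10, length(dirs)))) { # only takes the first 10 directories (fewer if there are not 10)
  # all files below this folder, with full paths so they can be opened from any working directory
  mydata[[i]] = list.files(path = dirs[i], recursive = TRUE, full.names = TRUE)
}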
You only have to adapt the path to your own folder structure.
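A note on the error from the question: readLines expects a single file path, while each element of the list built above is a character vector that may hold several paths (or none). Passing such a vector is a likely cause of "invalid 'description' argument". The sketch below reads the files one at a time instead. The folder and file names are made up for the demonstration and are created in a temporary directory, so it does not touch "~/dataset1".

```r
# Build a small throwaway tree (hypothetical names) so the sketch runs anywhere
root <- file.path(tempdir(), "dataset1_demo")
unlink(root, recursive = TRUE)
for (d in c("a/a1", "a/a2", "b/b1")) {
  dir.create(file.path(root, d), recursive = TRUE)
  writeLines(c("line 1", "line 2"), file.path(root, d, "data.txt"))
}

dirs <- list.dirs(root, recursive = FALSE)     # full paths of the top-level folders
dirs <- dirs[seq_len(min(10, length(dirs)))]   # keep at most the first 10

# One list element per folder; inside it, one character vector per file
contents <- lapply(dirs, function(d) {
  files <- list.files(d, recursive = TRUE, full.names = TRUE)
  lapply(files, readLines)                     # readLines receives one path at a time
})
names(contents) <- basename(dirs)
```

With your own data, replace root with the path to your dataset folder and drop the lines that create the demo tree.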
CodePudding user response:
Use dir() to get a vector of file names, for example all .txt files in folder "f" and all its subfolders:
files = dir("f", pattern = "\\.txt$", full.names = TRUE, recursive = TRUE) # pattern is a regular expression; "\\.txt$" matches names ending in .txt
files
[1] "f/f1/f1_1/f1_1.txt"
[2] "f/f1/f1_2/f1_2.txt"
[3] "f/f2/f2_1/f2_1.txt"
[4] "f/f2/f2_2/f2_2.txt"
Then read them using readLines, one file per list element:
lapply(files, readLines)
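The list returned by lapply is unnamed, which makes it hard to tell which element came from which file. A minimal sketch of one way to keep track, assuming you want to look results up by file name and group them by top-level folder. The demo tree and the names lines_by_file and by_folder are hypothetical:

```r
# Throwaway tree mirroring the "f" layout above (hypothetical content)
root <- file.path(tempdir(), "f_demo")
unlink(root, recursive = TRUE)
for (p in c("f1/f1_1", "f1/f1_2", "f2/f2_1")) {
  dir.create(file.path(root, p), recursive = TRUE)
  writeLines(paste("content of", basename(p)),
             file.path(root, p, paste0(basename(p), ".txt")))
}

files <- dir(root, pattern = "\\.txt$", full.names = TRUE, recursive = TRUE)

# Name each element after its file so it can be looked up by name
lines_by_file <- setNames(lapply(files, readLines), basename(files))

# Group the results by top-level folder (two levels above each file)
top_folder <- basename(dirname(dirname(files)))
by_folder <- split(lines_by_file, top_folder)
```

After this, by_folder$f1 holds the contents of every file below f1, and lines_by_file[["f2_1.txt"]] returns the lines of that single file.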