I have several .txt files that I am importing with the following code:
files = list.files(pattern="*.txt")%>%
map_df(~fread(.))
Each file has several rows and columns, and I want to add an ID column containing the file name. So if the files were called A-ML201.txt, A-YH248.txt, etc., I would get the following. The file names need to repeat since each file has multiple rows:
ID col1 col2
A-ML201 2 67
A-ML201 4 29
A-ML201 1 90
A-YH248 23 2
A-YH248 12 17
A-YH248 8 57
I have tried a few solutions from this thread: How to import multiple .csv files at once?, but I keep getting errors, maybe because these are .txt files? I tried replacing read.csv with read.table. I am new to this kind of thing, so any help is greatly appreciated!
CodePudding user response:
We could pass a named vector of file paths and then use the .id argument of map_dfr, which stores each element's name in a new column:
library(purrr)
library(dplyr)
library(stringr)
library(data.table)
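# anchor the pattern with $ so only names ending in .txt are matched
files <- list.files(path = "path/to/your/folder",
pattern = "\\.txt$", full.names = TRUE)
# name each path by its file name without the extension; these names become the ID values
names(files) <- str_remove(basename(files), "\\.txt$")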
map_dfr(files, fread, .id = 'ID')
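Recent purrr releases mark map_dfr as superseded, so the same idea can also be sketched with map plus list_rbind. This is only a sketch, assuming purrr 1.0.0 or later is installed; the folder name txt_demo and the two demo files are made up for illustration, and data.table = FALSE is used so that plain data frames are bound together.

```r
library(purrr)       # assumes purrr >= 1.0.0, which provides list_rbind()
library(data.table)
library(tools)

# Write two small demo files into a temporary folder (hypothetical data)
demo_dir <- file.path(tempdir(), "txt_demo")
dir.create(demo_dir, showWarnings = FALSE)
writeLines(c("col1 col2", "2 67", "4 29"), file.path(demo_dir, "A-ML201.txt"))
writeLines(c("col1 col2", "23 2", "12 17"), file.path(demo_dir, "A-YH248.txt"))

# Name each path by its file name without the extension
files <- list.files(demo_dir, pattern = "\\.txt$", full.names = TRUE)
names(files) <- file_path_sans_ext(basename(files))

# Read each file into a plain data.frame, then row-bind the list;
# names_to copies the element names into a new ID column
result <- files |>
  map(\(f) fread(f, data.table = FALSE)) |>
  list_rbind(names_to = "ID")

print(result)
```

As with .id in map_dfr, the ID values come from the names of the input vector, so the files must be named before they are read.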
CodePudding user response:
Suppose we have the reproducible input files generated in the Note at the end. Then get the list of files using Sys.glob (if these are the only .txt files, you can use just "*.txt"), and then run fread over each using Map and rbindlist to bind them together.
library(data.table)
library(tools) # this comes with R
"A-*.txt" |>
Sys.glob() |>
Map(f = fread) |>
rbindlist(idcol = "ID") |>
transform(ID = file_path_sans_ext(ID))
giving:
ID col1 col2
1: A-ML201 2 67
2: A-ML201 4 29
3: A-ML201 1 90
4: A-YH248 23 2
5: A-YH248 12 17
6: A-YH248 8 57
The above code seems preferable since it only uses data.table plus tools, which already comes with R, but if you are OK with a mix of several packages then this is how you would fix up the code in the question. Note that the pattern there, "*.txt", is a glob, whereas list.files expects a regular expression.
library(data.table)
library(purrr)
library(tools)
# regular expression: names starting with A- and ending in .txt (the dot is escaped)
"^A-.*\\.txt$" %>%
list.files(pattern = .) %>%
set_names(., file_path_sans_ext(.)) %>%
map_dfr(fread, .id = "ID")
Note
Lines1 <- "col1 col2
2 67
4 29
1 90"
Lines2 <- "col1 col2
23 2
12 17
8 57"
cat(Lines1, file = "A-ML201.txt")
cat(Lines2, file = "A-YH248.txt")
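Since the question mentions trying read.table, here is a minimal base R sketch of the same task that needs no extra packages. It assumes whitespace-separated files with a header line, as in the Note; the folder name base_demo and the helper read_with_id are made up for illustration. Note that header = TRUE has to be given explicitly here, because read.table only detects a header automatically when the first line has one field fewer than the rest.

```r
# Recreate the two input files from the Note in a temporary folder
demo_dir <- file.path(tempdir(), "base_demo")
dir.create(demo_dir, showWarnings = FALSE)
cat("col1 col2\n2 67\n4 29\n1 90\n", file = file.path(demo_dir, "A-ML201.txt"))
cat("col1 col2\n23 2\n12 17\n8 57\n", file = file.path(demo_dir, "A-YH248.txt"))

files <- list.files(demo_dir, pattern = "\\.txt$", full.names = TRUE)

# Hypothetical helper: read one file and prepend its name
# (without the extension) as the ID column
read_with_id <- function(f) {
  dat <- read.table(f, header = TRUE)
  data.frame(ID = tools::file_path_sans_ext(basename(f)), dat,
             stringsAsFactors = FALSE)
}

# Read every file and stack the resulting data frames
combined <- do.call(rbind, lapply(files, read_with_id))
print(combined)
```

For many or large files, the fread-based versions above are likely to be faster, but this form shows the whole mechanism in a few lines.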