Another program I am using (Raven Pro) produces hundreds of .txt files, each containing nine variables with headers. I also need to record the file name that each row comes from.
I am using
stringr::str_extract(names(dat_tab), "\\d+")
to get the file name into a data frame with rbindlist. My problem is that I only want a portion of the file name included.
Here's an example of one of my file names -
BIOL10_20201206_180000.wav.Table01.txt
So if I use the pattern
"\\d+"
to try to get the numbers, it only picks up the 10 before the first underscore, but the portion of the file name I need is 20201206_180000
Any help to get around this is appreciated :)
library(data.table)  # provides rbindlist()
myfiles <- list.files(path = folder, pattern = "\\.txt$", full.names = FALSE)
dat_tab <- sapply(myfiles, read.table, header = TRUE, sep = "\t", simplify = FALSE, USE.NAMES = TRUE)
# "\\d+" matches only the first run of digits in each name, which is "10" here
names(dat_tab) <- stringr::str_extract(names(dat_tab), "\\d+")
binded1 <- rbindlist(dat_tab, idcol = "files", fill = TRUE)
I ended up with the file name coming in as "10" from the file name "BIOL10_20201206_180000.wav.Table01.txt"
CodePudding user response:
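You can specify the number of digits in each part (eight for the date, six for the time):
library(stringr)
x <- "BIOL10_20201206_180000.wav.Table01.txt"
str_extract(x, "\\d{8}_\\d{6}")
# "20201206_180000"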
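As a rough sketch of how the same fixed-length pattern could be applied to the names of the whole list before binding, here is a base R version that does not need stringr. The second file name and the tiny data frames are made up for illustration; only the first name comes from the question, and dat_tab stands in for the list returned by sapply().

```r
# Hypothetical file names following the pattern shown in the question
files <- c("BIOL10_20201206_180000.wav.Table01.txt",
           "BIOL10_20201207_063000.wav.Table01.txt")

# Stand-in for the named list of tables read from those files
dat_tab <- setNames(list(data.frame(a = 1), data.frame(a = 2)), files)

# First match of 8 digits, an underscore, then 6 digits in each name.
# Note: regmatches() drops names with no match, so this assumes every
# file name contains a date_time stamp.
stamp <- regmatches(files, regexpr("[0-9]{8}_[0-9]{6}", files))

names(dat_tab) <- stamp
names(dat_tab)
```

With the names replaced this way, an id column built from them (for example with idcol in rbindlist) should carry the date_time stamp instead of the full file name.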
CodePudding user response:
A couple other options:
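x <- "BIOL10_20201206_180000.wav.Table01.txt"
#option 1
# the literal "_" before the group keeps the lazy prefix from stopping at "10_20201206"
sub("^.*?_(\\d+_\\d+).*$", "\\1", x, perl = TRUE)
#> [1] "20201206_180000"
#option 2
# the lookbehind requires an underscore immediately before the match
stringr::str_extract(x, "(?<=_)\\d+_\\d+")
#> [1] "20201206_180000"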