I have a vector of 8-character file names of the format
"/relative/path/to/folder/a(bc|de|fg)...[xy]1.sav"
where each bracketed group holds one of two or three known alternatives, and the '...' stands for three unknown characters. I want to find all file names that share the same unknown sequence XXX and sort them into a list of character vectors, one vector per sequence.
I am not sure how to proceed on this. I am thinking about extracting the letters in the fourth to sixth positions (the '...'), putting them into a vector, and then using `grep` to get all the files with the matching string.
E.g.
# Pseudo-code. Not functioning code, but sort of the thing I want to do
> char.extr <- str_extract(file.vector, !"a(bc|de|fg)...[xy]1.sav")
> char.extr
"JKL", "MNO" ,"PQR" ...
# Use grep and lapply to put matched strings into list
> path.list <- lapply(char.extr, grep, file.vector)
> path.list
1. "/relative/path/to/folder/abcJKLx1.sav"
"/relative/path/to/folder/adeJKLy1.sav"
2. "/relative/path/to/folder/afgMNOx1.sav"
"/relative/path/to/folder/abcMNOy1.sav"
CodePudding user response:
Since we know the name structure, I'd imagine extracting the three-letter substring and then using `split` to get the individual lists is what you're looking for.
split(file.vector, substr(basename(file.vector), 4, 6))
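Here is a minimal sketch of that idea, assuming the paths live in `file.vector` as in the question. The sample values are just the four example paths from the question's expected output, and `key` and `key.regex` are names made up for illustration.

```r
# Hypothetical sample input, taken from the example paths in the question
file.vector <- c(
  "/relative/path/to/folder/abcJKLx1.sav",
  "/relative/path/to/folder/adeJKLy1.sav",
  "/relative/path/to/folder/afgMNOx1.sav",
  "/relative/path/to/folder/abcMNOy1.sav"
)

# basename() drops the directory part, so characters 4 to 6 of what is
# left are the three unknown characters used as the grouping key
key <- substr(basename(file.vector), 4, 6)

# split() returns a named list with one character vector per distinct key
path.list <- split(file.vector, key)

# Alternative key that checks the whole name against the question's pattern
# instead of relying on fixed positions; "\\2" is the second capture group
key.regex <- sub("^a(bc|de|fg)(.{3})[xy]1\\.sav$", "\\2", basename(file.vector))
```

The positional version is shorter and should be fine as long as every file name follows the stated format. The regex version may be the safer choice if stray files could end up in the vector, since names that do not match are returned unchanged by `sub` and so stand out as their own groups.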