I want to extract a pattern from a vector of strings, i.e. extract
c("TAG a", "TAG b", "TAG c")
from c("TAG a", "TAG b-3", "TAG c 3")
So far I've tried:
str_vec <- c("TAG a", "TAG b-3", "TAG c 2", "2 TAG d")
stringr::str_extract(str_vec, "TAG .*(?=[\\ \\-])")
Which returns TAG b and c correctly, but doesn't extract TAG a or d.
If I try
stringr::str_extract(str_vec, "TAG .*(?=[\\ \\-]|$)")
TAG a and d are returned correctly, but $ seems to override the space/hyphen alternative,
so TAG b and c are returned with their suffixes still attached.
CodePudding user response:
You need
str_vec <- c("TAG a", "TAG b-3", "TAG c 2", "2 TAG d")
stringr::str_extract(str_vec, "TAG [^ -]*")
# => [1] "TAG a" "TAG b" "TAG c" "TAG d"
Details:
TAG - a fixed string, followed by a literal space
[^ -]* - zero or more chars other than a space and -
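As a side note on why the second attempt in the question fails: .* is greedy, so it first runs to the end of the string, where $ already satisfies the lookahead and no backtracking is needed. A minimal sketch, assuming you would rather keep the lookahead form, is to make the quantifier lazy (.*?) so that it stops at the first space, hyphen, or end of string. The name lazy_result is only illustrative.

```r
library(stringr)

str_vec <- c("TAG a", "TAG b-3", "TAG c 2", "2 TAG d")

# Lazy .*? grows one character at a time and stops as soon as the
# lookahead sees a space, a hyphen, or the end of the string.
lazy_result <- str_extract(str_vec, "TAG .*?(?=[ -]|$)")
print(lazy_result)
# => [1] "TAG a" "TAG b" "TAG c" "TAG d"
```

The negated character class above is usually simpler to read, since it needs neither a lookahead nor backtracking.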
CodePudding user response:
How about:
library(stringr)
str_extract(str_vec, "TAG [a-z]")
Output:
[1] "TAG a" "TAG b" "TAG c" "TAG d"
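Note that [a-z] matches exactly one lowercase letter, which is enough for the sample data. A small comparison on made-up inputs (multi_vec is hypothetical, not from the question) sketches how this pattern and the negated character class diverge once a tag is longer than one letter or is uppercase:

```r
library(stringr)

# Hypothetical inputs: a two-letter tag and an uppercase tag
multi_vec <- c("TAG ab-3", "TAG B 2")

# Matches "TAG " plus a single lowercase letter only
one_letter <- str_extract(multi_vec, "TAG [a-z]")
# Matches "TAG " plus everything up to the first space or hyphen
negated <- str_extract(multi_vec, "TAG [^ -]*")

print(one_letter)  # "TAG a" for the first element, NA for the second
print(negated)     # "TAG ab" and "TAG B"
```

If the tags are always a single lowercase letter, the shorter pattern is fine; otherwise [a-z]+ or the negated class is the safer choice.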