str_detect: how to distinguish between a1 and a11

I am trying to detect the expressions "a1" and "a11" in the abc vector.

abc <- c("a1","a1|a11","a14","a11", "a11|a14", "a1|a3|a14", "a11|a16")

The first query matches everything, including a11, a14, etc.:

str_detect(abc,"a1")

Is there a way to match only a1 and not a11, a12, etc.?

CodePudding user response：

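Use word boundaries to enclose the pattern you want.
"\\b" matches a word boundary on either side of the pattern. The last regex (with grepl) uses "\\<" and "\\>" as an example of boundaries that match only at the beginning and only at the end of a word, respectively. This works here because "|" is not a word character, while the second "1" in "a11" is, so there is no boundary after "a1" inside "a11".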

abc <- c("a1","a1|a11","a14","a11", "a11|a14", "a1|a3|a14", "a11|a16")

stringr::str_detect(abc,"\\ba1\\b") 
#> [1]  TRUE  TRUE FALSE FALSE FALSE  TRUE FALSE

grepl("\\ba1\\b", abc)
#> [1]  TRUE  TRUE FALSE FALSE FALSE  TRUE FALSE
grepl("\\<a1\\>", abc)
#> [1]  TRUE  TRUE FALSE FALSE FALSE  TRUE FALSE

Created on 2022-10-19 with reprex v2.0.2
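
One caveat with word boundaries, sketched below: characters such as "-" and "." also count as non-word characters, so "\\b" accepts more than just "|" as a separator. If "|" is the only delimiter in your data, a pattern that spells out the delimiters is stricter. The `extra` vector is a hypothetical input made up for illustration; it is not part of the question.

```r
abc <- c("a1", "a1|a11", "a14", "a11", "a11|a14", "a1|a3|a14", "a11|a16")

# Match "a1" only as a whole "|"-separated field:
# preceded by start of string or "|", followed by "|" or end of string.
delimited <- grepl("(^|\\|)a1(\\||$)", abc)
delimited
#> [1]  TRUE  TRUE FALSE FALSE FALSE  TRUE FALSE

# Hypothetical input: "-" and "." are non-word characters too
extra <- c("a1-x", "a1.5", "b|a1")

by_boundary <- grepl("\\ba1\\b", extra)            # word boundaries
by_boundary
#> [1] TRUE TRUE TRUE

by_delimiter <- grepl("(^|\\|)a1(\\||$)", extra)   # explicit "|" delimiters
by_delimiter
#> [1] FALSE FALSE  TRUE
```

For the data in the question both patterns give the same result, so the simpler "\\ba1\\b" is fine there.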

CodePudding user response：

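You could also reverse the roles and use each element of abc as the regular expression, matched against the string 'a1'. With fixed = FALSE (the default), "|" in an element is treated as regex alternation: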

abc <- c("a1","a1|a11","a14","a11", "a11|a14", "a1|a3|a14", "a11|a16")
unlist(lapply(abc, \(x) grepl(x, 'a1', fixed = FALSE)))
#> [1]  TRUE  TRUE FALSE FALSE FALSE  TRUE FALSE

^{Created on 2022-10-19 with reprex v2.0.2}
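
Note that the reversed grepl call is unanchored, so a shorter element such as a hypothetical "a" would also be reported as a match. A possible alternative, shown here only as a sketch and not as what the answer above proposes, is to split each element on a literal "|" and test for an exact field match. The names `fields` and `has_a1` are made up for this example.

```r
abc <- c("a1", "a1|a11", "a14", "a11", "a11|a14", "a1|a3|a14", "a11|a16")

# Unanchored: a hypothetical element "a" used as a pattern matches inside 'a1'
short_match <- grepl("a", "a1")
short_match
#> [1] TRUE

# Split on a literal "|" (fixed = TRUE disables regex interpretation),
# then check whether "a1" is exactly one of the resulting fields.
fields <- strsplit(abc, "|", fixed = TRUE)
has_a1 <- vapply(fields, function(f) "a1" %in% f, logical(1))
has_a1
#> [1]  TRUE  TRUE FALSE FALSE FALSE  TRUE FALSE
```

This avoids regular expressions entirely, which may be easier to reason about when the values can contain regex metacharacters.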