str_detect how to distinguish between a1 and a11

Time:10-20

I am trying to detect the expressions "a1" and "a11" in the character vector abc.

abc <- c("a1","a1|a11","a14","a11", "a11|a14", "a1|a3|a14", "a11|a16")

My first attempt matches every element, including those that contain only a11, a14, etc.:

str_detect(abc,"a1") 

Is there a way to match only a1, and not a11, a12, etc.?

CodePudding user response:

Use word boundaries to enclose the pattern you want.
"\\b" matches a word boundary on either side of the pattern. The last regex, used here with grepl, shows "\\<" and "\\>" as an example of word boundaries that match only at the beginning and only at the end of a word, respectively.

abc <- c("a1","a1|a11","a14","a11", "a11|a14", "a1|a3|a14", "a11|a16")

stringr::str_detect(abc,"\\ba1\\b") 
#> [1]  TRUE  TRUE FALSE FALSE FALSE  TRUE FALSE

grepl("\\ba1\\b", abc)
#> [1]  TRUE  TRUE FALSE FALSE FALSE  TRUE FALSE
grepl("\\<a1\\>", abc)
#> [1]  TRUE  TRUE FALSE FALSE FALSE  TRUE FALSE

Created on 2022-10-19 with reprex v2.0.2
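
Since the question asks about both "a1" and "a11", the word-boundary pattern above can be wrapped in a small helper. This is only a sketch: detect_code is a hypothetical name, not part of base R or stringr, and it assumes the code contains no regex metacharacters.

```r
# Hypothetical helper (not from the original answer): wrap a literal code
# in word boundaries and test each element of x.
# Assumes `code` contains no regex metacharacters such as "." or "+".
detect_code <- function(x, code) {
  grepl(paste0("\\b", code, "\\b"), x)
}

abc <- c("a1", "a1|a11", "a14", "a11", "a11|a14", "a1|a3|a14", "a11|a16")

detect_code(abc, "a1")
#> [1]  TRUE  TRUE FALSE FALSE FALSE  TRUE FALSE

detect_code(abc, "a11")
#> [1] FALSE  TRUE FALSE  TRUE  TRUE FALSE  TRUE
```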

CodePudding user response:

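You could also swap the arguments to grepl, so that each element of abc is used as the regex pattern and matched against the string 'a1'. With fixed = FALSE (the default), the "|" inside an element acts as regex alternation. Note that this is still a substring match, so an element such as "a" would also return TRUE: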

abc <- c("a1","a1|a11","a14","a11", "a11|a14", "a1|a3|a14", "a11|a16")
unlist(lapply(abc, \(x) grepl(x, 'a1', fixed = FALSE)))
#> [1]  TRUE  TRUE FALSE FALSE FALSE  TRUE FALSE

Created on 2022-10-19 with reprex v2.0.2
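
If the elements are always "|"-separated lists of codes (an assumption based on the example data), a different technique from the one in this answer is to split on the literal "|" and test for exact membership, which avoids regex matching altogether. A minimal sketch; has_code is a hypothetical name:

```r
# Hypothetical helper (not from the original answers): split each element
# on the literal "|" and check whether `code` is one of the resulting tokens.
has_code <- function(x, code) {
  tokens <- strsplit(x, "|", fixed = TRUE)  # fixed = TRUE: "|" is not a regex here
  vapply(tokens, function(tok) code %in% tok, logical(1))
}

abc <- c("a1", "a1|a11", "a14", "a11", "a11|a14", "a1|a3|a14", "a11|a16")

has_code(abc, "a1")
#> [1]  TRUE  TRUE FALSE FALSE FALSE  TRUE FALSE
```

Because the comparison is exact, a code such as "a1" never matches "a11" or "a14", and the codes may contain characters that would be special in a regex.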
