Regex for matching a phrase followed by certain numbers followed by certain string

Time:12-03

Another regex problem I can't solve. :/

I have the following input vector:

x <- c("test", paste0(rep(paste0("Q18r", c(1:14, 997)), each = 2), c("c1", "c2")))

I now want to do the following match:

  • Find the phrase "Q18r" that is
  • followed by any one of the numbers 1, 2, 3, 10, 11, 12, 997 that is
  • followed by the phrase "c1"

The problem is that I can't figure out how to add the "c1" at the end. I can at least get the first two conditions working with:

stringr::str_subset(x, "Q18r(?=[1-3]|1[1-2])|(997)")

I tried a lot of different ways to match c1, e.g. (?=c1), a plain c1, (c1), etc., but nothing works.

Any ideas?

CodePudding user response:

You can use

stringr::str_subset(x, "Q18r(?:[1-3]|1[1-2]|997)c1")
[1] "Q18r1c1"   "Q18r2c1"   "Q18r3c1"   "Q18r11c1"  "Q18r12c1"  "Q18r997c1"

Regex details:

  • Q18r - a fixed Q18r string
  • (?:[1-3]|1[1-2]|997) - a non-capturing group matching 1, 2, 3, 11, 12 or 997
  • c1 - a c1 string.
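
To check the pattern and see why the attempt in the question falls short, here is a small sketch. It uses base R's grep(..., value = TRUE, perl = TRUE) as a stand-in for stringr::str_subset() so that it runs without extra packages; that substitution is an assumption of this example, not part of the answer above. The last variant, with 1[0-2], is hypothetical and only relevant if 10 should match too, since the list in the question mentions 10 while the pattern above does not match it.

```r
# Input vector from the question: "test" plus Q18r<n>c1 / Q18r<n>c2
# for n in 1..14 and 997
x <- c("test", paste0(rep(paste0("Q18r", c(1:14, 997)), each = 2), c("c1", "c2")))

# Pattern from the answer. grep(value = TRUE) returns the matching elements,
# like str_subset(); perl = TRUE enables the (?:...) and (?=...) syntax.
answer <- grep("Q18r(?:[1-3]|1[1-2]|997)c1", x, value = TRUE, perl = TRUE)
print(answer)

# Attempt from the question. The top-level "|" splits the pattern into
# "Q18r(?=[1-3]|1[1-2])" OR "(997)", and the lookahead only peeks at the
# next character(s) without consuming them, so nothing ties the match to
# the end of the number or to the "c1" suffix.
attempt <- grep("Q18r(?=[1-3]|1[1-2])|(997)", x, value = TRUE, perl = TRUE)
print(attempt)

# Hypothetical variant, only if 10 should be accepted as well:
# widen 1[1-2] to 1[0-2].
with_ten <- grep("Q18r(?:[1-3]|1[0-2]|997)c1", x, value = TRUE, perl = TRUE)
print(with_ten)
```

Here answer should hold the same six strings as the output shown above, while attempt also picks up the c2 entries and numbers such as 10, 13 and 14, because any "Q18r" followed by a 1, 2 or 3 satisfies the lookahead. Putting all alternatives inside one consuming group and writing c1 directly after it is what forces the whole number and the suffix to match.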