Detecting a repeated sequence with regex

Time:12-01

I have a text example like

0s11 0s12 0s33 my name is 0sgfh 0s1 0s22 0s87

I want to detect the consecutive sequences of tokens that start with 0s.

So, the expected output should be 0s11 0s12 0s33, 0sgfh 0s1 0s22 0s87

I tried using regex

(0s\w+)

but that would detect each 0s11, 0s12, 0s33, etc. individually.

Any idea on how to modify the pattern?

CodePudding user response:

To get those 2 matches where there are at least 2 consecutive parts:

\b0s\w+(?:\s+0s\w+)+

Explanation

  • \b A word boundary to prevent a partial word match
  • 0s\w+ Match 0s followed by 1 or more word characters
  • (?:\s+0s\w+)+ Repeat 1 or more times: 1 or more whitespace characters followed by 0s and 1 or more word characters


If you also want to match a single occurrence:

\b0s\w+(?:\s+0s\w+)*


Note that \w+ matches 1 or more word characters, so the pattern will not match a bare 0s with nothing after it.
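For reference, here is a minimal Python sketch of how the two patterns could be applied to the sample text from the question. The variable names (`runs_only`, `runs_or_single`, `mixed`) and the second test string are made up for illustration; only `re.compile` and `findall` from the standard library are used.

```python
import re

text = "0s11 0s12 0s33 my name is 0sgfh 0s1 0s22 0s87"

# At least two consecutive 0s-tokens (the trailing group repeats 1+ times).
runs_only = re.compile(r"\b0s\w+(?:\s+0s\w+)+")

# Also accept a lone 0s-token (the trailing group repeats 0+ times).
runs_or_single = re.compile(r"\b0s\w+(?:\s+0s\w+)*")

# The group is non-capturing, so findall returns the full matches.
runs = runs_only.findall(text)
print(runs)  # ['0s11 0s12 0s33', '0sgfh 0s1 0s22 0s87']

# Hypothetical input with an isolated token, to show the difference.
mixed = "0s1 foo 0s2 0s3"
strict = runs_only.findall(mixed)
lenient = runs_or_single.findall(mixed)
print(strict)   # ['0s2 0s3']
print(lenient)  # ['0s1', '0s2 0s3']
```

Because the repeated group is written as `(?:...)` rather than `(...)`, `findall` returns whole runs instead of only the last token of each run.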

CodePudding user response:

Should be doable with re.findall(). Your pattern was correct! :)

import re
testString = "0s11 0s12 0s33 my name is 0sgfh 0s1 0s22 0s87"
# Raw string so the backslash reaches the regex engine; \w+ matches the rest of each token
print(re.findall(r'0s\w+', testString))

['0s11', '0s12', '0s33', '0sgfh', '0s1', '0s22', '0s87']

Hope this helps!
