I have a text example like
0s11 0s12 0s33 my name is 0sgfh 0s1 0s22 0s87
I want to detect the consecutive sequences of tokens that start with 0s.
So, the expected output should be 0s11 0s12 0s33 and 0sgfh 0s1 0s22 0s87.
I tried using the regex
(0s\w+)
but that would detect 0s11, 0s12, 0s33, etc. individually.
Any idea on how to modify the pattern?
CodePudding user response:
To get those 2 matches where there are at least 2 consecutive parts:
\b0s\w+(?:\s+0s\w+)+
Explanation
\b
A word boundary to prevent a partial word match
0s\w+
Match 0s and 1 or more word chars
(?:\s+0s\w+)+
Repeat 1 or more times: 1 or more whitespace chars followed by 0s and 1 or more word chars
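As a quick check of the pattern above in Python, here is a minimal sketch. The test string is the one from the question; the variable names are only illustrative.

```python
import re

# Sample text from the question.
text = "0s11 0s12 0s33 my name is 0sgfh 0s1 0s22 0s87"

# A run of at least two whitespace-separated tokens that each start with "0s".
pattern = re.compile(r"\b0s\w+(?:\s+0s\w+)+")

# The only group is non-capturing, so findall returns the full matches.
runs = pattern.findall(text)
print(runs)  # ['0s11 0s12 0s33', '0sgfh 0s1 0s22 0s87']
```

Because the group is written as (?:...) rather than (...), re.findall returns whole matches instead of only the last repetition of the group.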
If you also want to match a single occurrence:
\b0s\w+(?:\s+0s\w+)*
Note that \w+ matches 1 or more word characters, so it would not match 0s on its own.
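To illustrate the difference between the two variants, here is a small sketch. The sample string is made up for illustration and is not from the question.

```python
import re

# Hypothetical sample: one lone token, a bare "0s", and one run of two tokens.
text = "0s11 foo 0s bar 0s22 0s33"

# Variant requiring at least two consecutive tokens.
at_least_two = re.compile(r"\b0s\w+(?:\s+0s\w+)+")
# Variant that also accepts a single token.
one_or_more = re.compile(r"\b0s\w+(?:\s+0s\w+)*")

print(at_least_two.findall(text))  # ['0s22 0s33']
print(one_or_more.findall(text))   # ['0s11', '0s22 0s33']
```

In both cases the bare 0s is skipped, since \w+ needs at least one word character after it.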
CodePudding user response:
Should be doable with re.findall(). Your pattern was correct for matching each token! :)
import re
testString = "0s11 0s12 0s33 my name is 0sgfh 0s1 0s22 0s87"
print(re.findall(r'0s\w+', testString))
['0s11', '0s12', '0s33', '0sgfh', '0s1', '0s22', '0s87']
Hope this helps!