Home > Back-end >  regex python for matching exact number of characters with overlapping matches
regex python for matching exact number of characters with overlapping matches

Time:03-03

For example, I have a a string
t = "abftababcd"
I want to match a pattern
p= "abXX"
where X is a wild card and matches any character. For the given p and t I want to get
['abft', 'abab','abcd']
For p="XXab" the expected output is ['ftab', 'abab']

Now i want to match all the overlapping matches in string t. I have tried replacing X with
"ab\w{no.of X}"
This doesn't give overlapping matches. So i tried lookahead assertion as
"ab(?=(\w{no.of X}))"
Then this only gives me the matches with the pattern "ab". Also, does lookahead work if X is present in the beginning XXab of the string or in the middleaXXXb?

CodePudding user response:

You can match

(?=(ab..))

Capture group 1 contains each match, which for the string

"abftababcd"

are "abft", "abab" and "abcd".

The expression consists of a positive lookahead that asserts that the current position in the string is followed immediately by 'ab' followed by any two characters, with those four characters being saved to capture group 1.

Think of positions as locations between successive characters or before the first character in the string.

Demo

  • Related