I have a regex that matches a repeating sequence of the word 'abc' separated by spaces:
re.search(r"(abc+\s){3}", "abc abc abc ")
I want to extend it to match sequences with 0 or 1 inserted words (the inserted words can differ and can have different lengths). Examples:
"abc abc abc " #match
"abc defgh abc abc " #match
"abc xyz abc abc " #match
"abc abc ghi abc " #match
"abc ghi abc def abc " #not match
The number of repetitions may also vary, so the solution should work for any number of repetitions.
CodePudding user response:
Try this:
/abc+\s(([a-z]+\sabc)|(abc\s[a-z]+)|abc)\sabc+\s/
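A quick way to check this pattern against the examples from the question is to run it through Python's re module. This is only a sketch: it drops the surrounding / delimiters (which Python does not use) and assumes the quantifiers are + (abc+, [a-z]+). The names pattern and samples are illustrative.

```python
import re

# The answer's pattern without the / delimiters.
# Assumption: the quantifiers are "+" (abc+, [a-z]+).
pattern = re.compile(r"abc+\s(([a-z]+\sabc)|(abc\s[a-z]+)|abc)\sabc+\s")

# The five example strings from the question.
samples = [
    "abc abc abc ",
    "abc defgh abc abc ",
    "abc xyz abc abc ",
    "abc abc ghi abc ",
    "abc ghi abc def abc ",
]

for text in samples:
    # search() looks for the pattern anywhere in the string.
    print(repr(text), bool(pattern.search(text)))
```

Note that the pattern is written for exactly three occurrences of abc, so it does not cover a variable number of repetitions.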
CodePudding user response:
The pattern you're looking for is (abc\s)(?:(?:\1\1)|(?:\w+\s\1\1)|(?:\1\w+\s))
import re
pattern = r"(abc\s)(?:(?:\1\1)|(?:\w+\s\1\1)|(?:\1\w+\s))"
bool(re.search(pattern, "abc abc abc ")) # True
bool(re.search(pattern, "abc defgh abc abc ")) # True
bool(re.search(pattern, "abc xyz abc abc ")) # True
bool(re.search(pattern, "abc abc ghi abc ")) # True
bool(re.search(pattern, "abc ghi abc def abc ")) # False
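The pattern in this answer is fixed at three repetitions, while the question asks for a variable count. One possible way to generalize, sketched here under assumptions, is to build the regex programmatically with one alternative per gap where the extra word may sit. The helper build_pattern and its parameters are hypothetical and not part of the answer. It also assumes the inserted word must differ from the repeated word and may only appear between two repetitions.

```python
import re


def build_pattern(word: str = "abc", repetitions: int = 3) -> re.Pattern:
    """Hypothetical helper: `repetitions` copies of `word`, each followed by
    one whitespace character, with at most one other word in between."""
    if repetitions < 2:
        raise ValueError("need at least two repetitions to have a gap")
    unit = rf"{re.escape(word)}\s"
    # One inserted word that is not `word` itself, plus its trailing whitespace.
    other = rf"(?!{re.escape(word)}\s)\w+\s"
    # One alternative per gap: i copies, optional extra word, remaining copies.
    alternatives = [
        rf"(?:{unit}){{{i}}}(?:{other})?(?:{unit}){{{repetitions - i}}}"
        for i in range(1, repetitions)
    ]
    return re.compile(r"\b(?:" + "|".join(alternatives) + ")")


three = build_pattern("abc", 3)
print(bool(three.search("abc abc ghi abc ")))
print(bool(three.search("abc ghi abc def abc ")))

four = build_pattern("abc", 4)
print(bool(four.search("abc abc x abc abc ")))
print(bool(four.search("abc x abc y abc abc ")))
```

The generated pattern grows linearly with the number of repetitions, which is usually acceptable for small counts.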
CodePudding user response:
You may use this regex to match:
\b(?!(abc (?!abc )\w+ ){2})(?:abc (?:\w+ )?){3}
RegEx Details:
\b : Word boundary
(?!(abc (?!abc )\w+ ){2}) : Fail the match if there is more than one non-abc word
(?: : Start a non-capture group
abc : Match abc followed by 1 space
(?: : Start inner non-capture group
\w+ : Match 1 or more word characters followed by 1 space
)? : End inner non-capture group. Match this group 0 or 1 times
){3} : End outer non-capture group. Repeat this group 3 times
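The breakdown above can also be written directly into the pattern using Python's re.VERBOSE flag. This is a sketch, assuming the word-character quantifier is \w+. Because verbose mode ignores unescaped whitespace, each literal space is written as [ ]. The names pattern and samples are illustrative.

```python
import re

# Verbose form of the pattern. [ ] stands for a literal space, because
# re.VERBOSE ignores unescaped whitespace outside character classes.
pattern = re.compile(
    r"""
    \b                                # word boundary
    (?!(abc[ ](?!abc[ ])\w+[ ]){2})   # fail if two inserted (non-abc) words follow
    (?:                               # start outer non-capture group
        abc[ ]                        #   abc followed by one space
        (?:\w+[ ])?                   #   optional inserted word and one space
    ){3}                              # repeat the outer group 3 times
    """,
    re.VERBOSE,
)

# The five example strings from the question.
samples = [
    "abc abc abc ",
    "abc defgh abc abc ",
    "abc xyz abc abc ",
    "abc abc ghi abc ",
    "abc ghi abc def abc ",
]

for text in samples:
    print(repr(text), bool(pattern.search(text)))
```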