I am working with Java regexes, but I guess the principles apply to every regex flavor.
I have these requirements for the segment a regex should match:
- have at least 3 times 'a'
- have at least 3 times 'b'
- occurrences of 'a' and 'b' can be in any order
Inspired by this post I came up with the following regex (regex101):
(?=([b]*[a]){3})(?=([a]*[b]){3})[ab]+
I am struggling with adding a new requirement:
- Match if there are either no 'c' or at least 3 'c'
- as above, 'c' can occur anywhere in the segment
Examples for valid sequences:
aaabbb
ababab
aaabbbccc
abcabcabc
ababcabcc
Examples for invalid sequences (as a whole):
aaabbbc
aabbb
abbccc
abcabca
My thoughts so far:
Having at least 3 'c'
(?=([bc]*[a]){3})(?=([ac]*[b]){3})(?=([ab]*[c]){3,})[abc]+
Combining this and the solution above in a crude manner (regex101), which is basically just one large "either none or at least 3" alternation:
((?=([bc]*[a]){3})(?=([ac]*[b]){3})(?=([ab]*[c]){3,})[abc]+|(?=([b]*[a]){3})(?=([a]*[b]){3})[ab]+)
Finally, the question: Is there a better way to achieve this using other methods, like or-ing the 'c'-requirement look-ahead, nested look-aheads or something entirely different?
CodePudding user response:
(?=^(?:.*a){3}.*$)(?=^(?:.*b){3}.*$)(?=^(?:.*c){3}.*$|^[^c]*$).*
Short Explanation
(?=^(?:.*a){3}.*$)
Assert that the string contains at least 3 'a'
(?=^(?:.*b){3}.*$)
Assert that the string contains at least 3 'b'
(?=^(?:.*c){3}.*$|^[^c]*$)
Assert that the string contains at least 3 'c', or that the string does not contain any 'c'
.*
Match the whole string that passes all assertions
Also, see the regex demo and Java example
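Since the linked Java example is not reproduced here, this is a minimal sketch of how the pattern could be checked against the sequences from the question. The class name WholeStringCheck and the helper isValid are made up for illustration.

```java
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original answer.
final class WholeStringCheck {
    // The regex from this answer; each lookahead carries its own ^ and $ anchors.
    static final Pattern PATTERN = Pattern.compile(
        "(?=^(?:.*a){3}.*$)(?=^(?:.*b){3}.*$)(?=^(?:.*c){3}.*$|^[^c]*$).*");

    // matches() requires the whole input to match, which suits "as a whole" checks.
    static boolean isValid(String s) {
        return PATTERN.matcher(s).matches();
    }

    public static void main(String[] args) {
        List<String> samples = List.of(
            "aaabbb", "ababab", "aaabbbccc", "abcabcabc", "ababcabcc", // expected: true
            "aaabbbc", "aabbb", "abbccc", "abcabca");                  // expected: false
        for (String s : samples) {
            System.out.println(s + " -> " + isValid(s));
        }
    }
}
```

One thing to keep in mind: because the pattern uses . rather than [abc], it only counts 'a', 'b' and 'c' and does not restrict the alphabet, so an input like aaabbbx would also pass.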
CodePudding user response:
You could assert 3 times 'a' and 3 times 'b', and then optionally match at least 3 times a 'c'. Add the anchors ^ and $ to assert the start and the end of the string.
Note that you don't have to put a single char like [a] in a character class:
^(?=([bc]*a){3})(?=([ca]*b){3})[ab]*(?:c[ab]*c[ab]*c[abc]*)?$
Explanation
^
Start of string
(?=([bc]*a){3})
Assert 3 times an 'a' char
(?=([ca]*b){3})
Assert 3 times a 'b' char
[ab]*
Match optional chars 'a' or 'b'
(?:
Non capture group
c[ab]*c[ab]*c
Match 3 times a 'c' char
[abc]*
Match optional 'a', 'b' and 'c' chars
)?
Close the non capture group and make it optional
$
End of string
As you don't really need the capture groups, you can use non capture groups (?: instead for the repetition:
^(?=(?:[bc]*a){3})(?=(?:[ca]*b){3})[ab]*(?:c[ab]*c[ab]*c[abc]*)?$
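As a rough sketch of how the non-capturing variant might be used from Java, the following filters the question's sample sequences down to the valid ones. The class AnchoredCheck and the method filterValid are illustrative names only.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper class, not part of the original answer.
final class AnchoredCheck {
    // Non-capturing variant of the regex from this answer.
    static final Pattern PATTERN = Pattern.compile(
        "^(?=(?:[bc]*a){3})(?=(?:[ca]*b){3})[ab]*(?:c[ab]*c[ab]*c[abc]*)?$");

    // Keeps only the inputs that match the pattern as a whole.
    static List<String> filterValid(List<String> inputs) {
        return inputs.stream()
                .filter(s -> PATTERN.matcher(s).matches())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> inputs = List.of(
            "aaabbb", "ababab", "aaabbbccc", "abcabcabc", "ababcabcc",
            "aaabbbc", "aabbb", "abbccc", "abcabca");
        // prints [aaabbb, ababab, aaabbbccc, abcabcabc, ababcabcc]
        System.out.println(filterValid(inputs));
    }
}
```

Because the consuming part only allows [abc], this pattern also rejects inputs containing any other character.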
CodePudding user response:
You can use
(?<![abc]) # No "a", "b", "c" allowed immediately on the left
(?=(?:[bc]*a){3}) # At least three "a"s
(?=(?:[ac]*b){3}) # At least three "b"s
(?: # Either
(?=[ab]*(?![abc])) # only "a" or "b"s allowed until a location not followed with "a", "b" or "c"
| # or
(?=(?:[ab]*c){3}) # At least three "c"s
)
[abc]+ # Match and consume one or more "a", "b" or "c" chars
See the regex demo.
As a single line:
(?<![abc])(?=(?:[bc]*a){3})(?=(?:[ac]*b){3})(?:(?=[ab]*(?![abc]))|(?=(?:[ab]*c){3}))[abc]+
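Because this pattern relies on a lookbehind instead of ^ and $, it can pick valid segments out of a longer text. Here is a small sketch of that, assuming the consuming part is [abc]+ (as the "one or more" comment suggests); SegmentFinder and findSegments are made-up names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original answer.
final class SegmentFinder {
    // Single-line pattern; the trailing "+" quantifier is an assumption.
    static final Pattern PATTERN = Pattern.compile(
        "(?<![abc])(?=(?:[bc]*a){3})(?=(?:[ac]*b){3})"
        + "(?:(?=[ab]*(?![abc]))|(?=(?:[ab]*c){3}))[abc]+");

    // Collects every segment of a/b/c characters that satisfies all assertions.
    static List<String> findSegments(String text) {
        List<String> found = new ArrayList<>();
        Matcher m = PATTERN.matcher(text);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        // prints [aaabbb, abcabcabc, ababcabcc]
        System.out.println(findSegments("aaabbb abcabcabc aaabbbc xx ababcabcc"));
    }
}
```

The lookbehind (?<![abc]) is what stops the regex from starting a match in the middle of an invalid segment such as aaabbbc.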