Regex that does not accept sub strings of more than two 'b'

Time:01-03

I need a regex that accepts all the strings consisting only of characters a and b, except those with more than two 'b' in a row.

For example, these should not match:

abb
ababbb
bba
bbbaa
bbb
bb

I came up with this, but it's not working:

[a-b]+b{2,}[a-b]*

Here is my code:

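#include <iostream>
#include <regex>
#include <string>
using namespace std;

int main() {
    string input;
    // Raw string literal, so \b reaches the regex engine as a word boundary
    // instead of being read by the compiler as a backspace character.
    regex validator_regex(R"(\b(?:b(?:a+b?)*|(?:a+b?)+)\b)");

    cout << "Hello, " << endl;
    while (regex_match(input, validator_regex) == false) {
        cout << "please enter your choice of regEx :" << endl;
        if (!(cin >> input))
            break;  // stop on end of input instead of looping forever
        if (regex_match(input, validator_regex) == false)
            cout << input << " is not a valid input" << endl;
        else
            cout << input << " is  valid " << endl;
    }
}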

CodePudding user response:

Your pattern [a-b]+b{2,}[a-b]* matches 1 or more a or b chars followed by bb, which is exactly what you don't want. Also note that a matching string has to be at least 3 characters long because of the part [a-b]+b{2,}
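To see this concretely, here is a small sketch that runs the pattern from the question through std::regex_match. It assumes the quantifier after the first character class is +, and the function name original_pattern_matches is only an illustrative name, not something from the question.

```cpp
#include <regex>
#include <string>

// Illustrative helper (hypothetical name): true when the whole string is
// matched by the pattern from the question.
bool original_pattern_matches(const std::string& s) {
    // One or more a/b, then at least two b, then any number of a/b.
    static const std::regex re(R"([a-b]+b{2,}[a-b]*)");
    return std::regex_match(s, re);
}
```

With this helper, "abb" and "bbb" are accepted although they should be rejected, while "aba" is rejected although it should be accepted.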


To not match 2 b chars in a row, you can exclude those matches with a negative lookahead that matches optional chars a or b until it encounters bb.

Note that [a-b] is the same as [ab].

\b(?![ab]*?bb)[ab]+\b
  • \b A word boundary
  • (?![ab]*?bb) Negative lookahead, assert that what is to the right is not 0+ times a or b followed by bb
  • [ab]+ Match 1+ occurrences of a or b
  • \b A word boundary
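A minimal sketch of how the lookahead pattern could be used from C++, assuming the default ECMAScript grammar of std::regex (which supports lookahead and \b). The function name has_no_double_b is made up for this example.

```cpp
#include <regex>
#include <string>

// Illustrative helper (hypothetical name): true when `s` is a non-empty
// string of a/b characters that never contains "bb".
bool has_no_double_b(const std::string& s) {
    // Raw string literal: the backslashes are passed to the regex engine
    // unchanged, so \b means "word boundary" here.
    static const std::regex re(R"(\b(?![ab]*?bb)[ab]+\b)");
    // regex_match requires the pattern to cover the entire input.
    return std::regex_match(s, re);
}
```

For the inputs from the question (abb, ababbb, bba, bbbaa, bbb, bb) this returns false, and for strings such as aba or bab it returns true.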


Without using lookarounds, you can match the strings that you don't want by matching a string that contains bb, and capture in group 1 the strings that you want to keep:

\b[ab]*bb[ab]*\b|\b([ab]+)\b
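A sketch of the capture-group idea, assuming the input is a line of whitespace-separated words. The function name words_without_bb is hypothetical. Only matches where group 1 took part are kept; the matches that contain bb are consumed by the first alternative and discarded.

```cpp
#include <regex>
#include <string>
#include <vector>

// Illustrative helper (hypothetical name): returns the a/b words in `text`
// that do not contain "bb".
std::vector<std::string> words_without_bb(const std::string& text) {
    static const std::regex re(R"(\b[ab]*bb[ab]*\b|\b([ab]+)\b)");
    std::vector<std::string> kept;
    for (std::sregex_iterator it(text.begin(), text.end(), re), end;
         it != end; ++it) {
        // Group 1 is set only when the second alternative matched,
        // i.e. when the word has no "bb" in it.
        if ((*it)[1].matched)
            kept.push_back((*it)[1].str());
    }
    return kept;
}
```

Called on "aba abb bab bb a", this keeps aba, bab and a.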


Or use an alternation that matches either a string starting with b, followed by optional repetitions of 1+ a chars and an optional b, or 1+ repetitions of 1+ a chars followed by an optional b:

\b(?:b(?:a+b?)*|(?:a+b?)+)\b
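One way to gain some confidence in the alternation pattern is to compare it with a plain substring test on every short string over a and b. The sketch below does that. Both function names are made up for the example, and the brute-force comparison is only a sanity check, not a proof.

```cpp
#include <regex>
#include <string>

// Illustrative helper (hypothetical name): true when the alternation
// pattern accepts the whole string.
bool accepts(const std::string& s) {
    static const std::regex re(R"(\b(?:b(?:a+b?)*|(?:a+b?)+)\b)");
    return std::regex_match(s, re);
}

// Compares the regex with s.find("bb") on every non-empty string over
// {a, b} of up to `max_len` characters. Returns true if they always agree.
bool agrees_with_substring_check(int max_len) {
    for (int len = 1; len <= max_len; ++len) {
        for (unsigned bits = 0; bits < (1u << len); ++bits) {
            std::string s;
            for (int i = 0; i < len; ++i)
                s += ((bits >> i) & 1u) ? 'b' : 'a';
            bool expected = s.find("bb") == std::string::npos;
            if (accepts(s) != expected)
                return false;
        }
    }
    return true;
}
```

If a regex is not required at all, the substring test used as the reference here (s.find("bb")) combined with a check that only a and b occur would do the same job.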


CodePudding user response:

The simplest regex is:

^(?!.*bb)[ab]+$

This regex works by adding a negative look ahead (anchored to start) for bb appearing anywhere within input consisting of a or b.

If zero length input should match, change [ab]+ to [ab]*.
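A short sketch of both variants in C++, assuming std::regex with its default ECMAScript grammar. The names valid_nonempty and valid_or_empty are invented for the example. Since std::regex_match already requires the whole input to match, the ^ and $ anchors are redundant there, but they are kept so the pattern is identical to the one above.

```cpp
#include <regex>
#include <string>

// Illustrative helper (hypothetical name): non-empty a/b string without "bb".
bool valid_nonempty(const std::string& s) {
    static const std::regex re(R"(^(?!.*bb)[ab]+$)");
    return std::regex_match(s, re);
}

// Same check, but the empty string is accepted as well ([ab]* instead of [ab]+).
bool valid_or_empty(const std::string& s) {
    static const std::regex re(R"(^(?!.*bb)[ab]*$)");
    return std::regex_match(s, re);
}
```

Both functions reject abb and bb; they differ only on the empty string.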
