Match string with a limit of consecutive uppercase characters

Time:03-03

I need a regex that matches strings of non-whitespace characters in which at most 2 consecutive uppercase letters are allowed, so a run of 3 or more uppercase letters must not match. I'm not experienced enough to work that out on my own.

A list of example strings that should match:

a
aa
aaa
aaaa
A
aA
Aa
AA
aAA
AAa
aAAa
aaAAaa
AAaAAa
AaAaA

Strings that must not match:

AAA
aAAA
AAAa
aAAAa
aaaAAaAAA

a and A stand for any lowercase and uppercase characters that are part of normal words (not punctuation such as , or something like that).

As interesting as it is, it is too tricky for me. I don't even know how to start.

Update: @zerOOne pointed me to this answer.

/\b\p{L}*\p{Lu}{3}\p{L}*\b/u

seems to be exactly the opposite of what I want. I tried to negate it with

/(?!\b\p{L}*\p{Lu}{3}\p{L}*\b)/u

But that doesn't work. How else could I negate the regex?
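
A likely reason the negated attempt fails can be sketched as follows. The question does not name a language, so JavaScript (with Unicode property escapes, ES2018 or later) is assumed here only because the /…/u literals are valid in it. An unanchored negative lookahead is allowed to succeed at any position where the inner pattern does not match, so it finds an empty match inside "AAA" instead of rejecting the string.

```javascript
// The negated attempt from the question, unchanged.
const negated = /(?!\b\p{L}*\p{Lu}{3}\p{L}*\b)/u;

// At index 0 the inner pattern matches "AAA", so the negative lookahead fails there.
// At index 1 there is no word boundary, the inner pattern cannot match,
// and the negative lookahead succeeds with an empty match.
const hit = negated.exec("AAA");
console.log(hit !== null, hit.index, JSON.stringify(hit[0])); // true 1 ""

// Anchoring at the start and scanning with .* inside the lookahead
// makes the check cover the whole string.
const anchored = /^(?!.*\p{Lu}{3})/u;
console.log(anchored.test("AAA"), anchored.test("aAAa")); // false true
```

The anchored form only decides whether the string is acceptable; it still needs a body after the lookahead if the match itself should consume the string.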

CodePudding user response:

If the string should contain only non-whitespace characters and at most 2 consecutive uppercase letters, then this regex will do:

^(?!.*\p{Lu}{3})\S+$

^ : start of line or string
(?!.*\p{Lu}{3}) : negative lookahead to skip strings with at least 3 consecutive uppercase letters
\S+ : one or more non-whitespace characters
$ : end of line or string
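
As a quick check, the pattern can be run against the example lists from the question. This is a minimal sketch that assumes JavaScript with the u flag (the thread does not name a language) and writes the pattern with the + quantifier that "one or more non-whitespace characters" implies. The variable names are illustrative only.

```javascript
// Pattern from this answer: reject any run of 3 uppercase letters,
// then require one or more non-whitespace characters.
const maxTwoUpper = /^(?!.*\p{Lu}{3})\S+$/u;

// Example strings taken from the question.
const shouldMatch = [
  "a", "aa", "aaa", "aaaa", "A", "aA", "Aa", "AA",
  "aAA", "AAa", "aAAa", "aaAAaa", "AAaAAa", "AaAaA",
];
const shouldNotMatch = ["AAA", "aAAA", "AAAa", "aAAAa", "aaaAAaAAA"];

const allAccepted = shouldMatch.every((s) => maxTwoUpper.test(s));
const allRejected = shouldNotMatch.every((s) => !maxTwoUpper.test(s));
console.log(allAccepted, allRejected); // true true
```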

To match only strings that consist entirely of letters, simply replace \S with \p{L}:

\p{L} : any kind of letter from any language.

^(?!.*\p{Lu}{3})\p{L}+$
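
The difference between the two variants shows up on input that contains digits or punctuation. The sketch below again assumes JavaScript with the u flag and the + quantifier on both patterns. The sample strings are made up for illustration and are not from the question.

```javascript
const anyNonSpace = /^(?!.*\p{Lu}{3})\S+$/u;     // any non-whitespace characters
const lettersOnly = /^(?!.*\p{Lu}{3})\p{L}+$/u;  // letters only

// Hypothetical samples; the escapes spell "Ääb" and "ÄÖÜ".
const samples = ["aAAa", "aA,a", "aA1", "\u00C4\u00E4b", "\u00C4\u00D6\u00DC"];

const comparison = samples.map((s) => [s, anyNonSpace.test(s), lettersOnly.test(s)]);
console.log(comparison);
// "aAAa" -> true,  true
// "aA,a" -> true,  false  (comma is not a letter)
// "aA1"  -> true,  false  (digit is not a letter)
// "Ääb"  -> true,  true
// "ÄÖÜ"  -> false, false  (three uppercase letters in a row)
```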

CodePudding user response:

Two solutions seem to be possible.

Taken from the comments above (credits to @WiktorStribiżew):

^(?!.*\b\p{L}*\p{Lu}{3}\p{L}*\b).*

Found this one on my own (not 100% sure about it, but it seems to work):

^((?!\p{Lu}{3})\p{L})*$
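
Both patterns can be checked against the examples from the question. This is a sketch assuming JavaScript with the u flag, with illustrative variable names. Only ASCII letters are exercised, because \b behaviour on non-ASCII letters depends on the engine. The two patterns agree on the example lists but differ at the edges: the first ends in .* and therefore also accepts whitespace, while the second accepts letters only, and both accept the empty string.

```javascript
const viaLookahead = /^(?!.*\b\p{L}*\p{Lu}{3}\p{L}*\b).*/u;
const viaTemperedToken = /^((?!\p{Lu}{3})\p{L})*$/u;

// Example strings taken from the question.
const accepted = [
  "a", "aa", "aaa", "aaaa", "A", "aA", "Aa", "AA",
  "aAA", "AAa", "aAAa", "aaAAaa", "AAaAAa", "AaAaA",
];
const rejected = ["AAA", "aAAA", "AAAa", "aAAAa", "aaaAAaAAA"];

const bothAgree =
  accepted.every((s) => viaLookahead.test(s) && viaTemperedToken.test(s)) &&
  rejected.every((s) => !viaLookahead.test(s) && !viaTemperedToken.test(s));
console.log(bothAgree); // true

// Edge cases where the two patterns behave differently or surprisingly.
console.log(viaLookahead.test("aA bB"), viaTemperedToken.test("aA bB")); // true false
console.log(viaLookahead.test(""), viaTemperedToken.test(""));           // true true
```

If empty input should be rejected, changing the final * of the second pattern to + is one option.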

People told me that this is a duplicate question. I'll leave the answer here for anyone who needs it, though. No need to upvote.
