I would like to know if the following constraint can be checked with regex: "Must be at least 5 characters, of which 4 should be letters"
I know how to express the "must be at least 5 characters" constraint, but I'm not sure about the "of which 4 should be letters" part, or whether it's even possible with regex.
CodePudding user response:
Yes, it is possible. Please use the following regex.
(?=.{5,})\w*[a-z]\w*[a-z]\w*[a-z]\w*[a-z]\w*
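Explanation
(?=.{5,})
Lookahead assertion: asserts that at least 5 characters (any character, 5 or more repetitions) follow, without consuming them.
\w*[a-z]\w*[a-z]\w*[a-z]\w*[a-z]\w*
Four lowercase letters from a to z, each optionally preceded or followed by other word characters (letters, digits, underscore).
NOTE: The (?=.{5,}) lookahead only asserts that the string contains 5 or more characters; the rest of the pattern does the actual matching.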
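To see how this pattern behaves in practice, here is a minimal sketch assuming Python's re module and whole-string matching via re.fullmatch. The helper name matches_first is made up for illustration. Because the pattern uses \w and a lowercase-only [a-z], strings that contain punctuation, or that lack four lowercase letters, are rejected.

```python
import re

# Pattern from the answer above, compiled once.
FIRST = re.compile(r"(?=.{5,})\w*[a-z]\w*[a-z]\w*[a-z]\w*[a-z]\w*")

def matches_first(s: str) -> bool:
    """Hypothetical helper: True if the whole string matches the pattern."""
    return FIRST.fullmatch(s) is not None

# "ab-cd" has five characters and four letters, but "-" is not matched by \w.
for sample in ["abcd1", "a1b2c3d4", "abcd", "abc12", "ab-cd"]:
    print(sample, matches_first(sample))
```

If punctuation should be allowed between the letters, \w would have to be swapped for a wider class such as the dot.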
CodePudding user response:
You can also use this pattern:
(?i)(?=.*[a-z].*[a-z].*[a-z].*[a-z].*).{5,}
Here, the positive lookahead (?=.*[a-z].*[a-z].*[a-z].*[a-z].*) asserts that there must be four letters (case does not matter, given (?i)), either adjacent or separated by other characters. Once that condition is met, the regex matches any string that is at least 5 characters long.
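A small sketch of this pattern in use, assuming Python's re module (which accepts the inline (?i) flag at the start of a pattern) and re.fullmatch for whole-string matching. The name matches_second is hypothetical.

```python
import re

# (?i) makes [a-z] case-insensitive; the lookahead demands four letters,
# then .{5,} consumes the string only if it has at least five characters.
SECOND = re.compile(r"(?i)(?=.*[a-z].*[a-z].*[a-z].*[a-z].*).{5,}")

def matches_second(s: str) -> bool:
    """Hypothetical helper: True if the whole string satisfies both rules."""
    return SECOND.fullmatch(s) is not None

for sample in ["AbCd!", "ab-cd", "abc12", "abcd", "1234a"]:
    print(sample, matches_second(sample))
```

Note that the dot does not match a newline by default in Python, so multi-line input would need re.DOTALL.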
CodePudding user response:
What language or tool are you using?
This sounds like one of those things that doesn't need to be a single regex.
Here's "at least four letters"
[a-z].*[a-z].*[a-z].*[a-z]
and here's "at least five characters"
.{5,}
or even, if you're in a language like PHP, avoid regexes entirely and be more explicit:
strlen($str) >= 5
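The "two separate checks" idea might look like this in Python (the asker's language is not stated, so Python is an assumption here, as is the function name is_valid). The regex is the four-letter pattern from this answer, unchanged, so it only counts lowercase letters; passing re.IGNORECASE would be one way to accept uppercase as well.

```python
import re

# "At least four letters", exactly as written in the answer above.
FOUR_LETTERS = re.compile(r"[a-z].*[a-z].*[a-z].*[a-z]")

def is_valid(s: str) -> bool:
    """Hypothetical helper: plain length check plus a simple regex search."""
    return len(s) >= 5 and FOUR_LETTERS.search(s) is not None

for sample in ["abcd1", "ab-cd", "abcd", "abc12"]:
    print(sample, is_valid(sample))
```

Keeping the checks separate also makes it easy to report which rule failed.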
CodePudding user response:
You can even do this without lookahead! Consider the following RegEx:
(.+[a-z].*[a-z].*[a-z].*[a-z].*)|(.*[a-z].+[a-z].*[a-z].*[a-z].*)|(.*[a-z].*[a-z].+[a-z].*[a-z].*)|(.*[a-z].*[a-z].*[a-z].+[a-z].*)|(.*[a-z].*[a-z].*[a-z].*[a-z].+)
Depending on your engine, you may have to anchor this using ^ and $.
Generation: I simply shifted the + quantifier through all five positions. The four letters are a must, but the fifth character can be at any position.
If possible, you should avoid using RegEx for this though, or combine a RegEx that checks whether four letters are present (.*[a-z].*[a-z].*[a-z].*[a-z].*) with a simple length check.
If you need exactly four of the characters to be letters, replace . with [^a-z].
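The construction described above (one mandatory-character gap that moves through the five positions around the four letters) can be generated instead of typed by hand. This is only a sketch, assuming Python's re module; build_pattern and matches_alternation are illustrative names, and re.fullmatch stands in for the ^ and $ anchors.

```python
import re

LETTER = "[a-z]"

def build_pattern() -> str:
    """Build the five-way alternation by moving the single '.+' gap."""
    alternatives = []
    for plus_at in range(5):  # five gaps surround four letters
        gaps = [".+" if i == plus_at else ".*" for i in range(5)]
        # Interleave: gap LETTER gap LETTER gap LETTER gap LETTER gap
        parts = [gaps[0]]
        for gap in gaps[1:]:
            parts += [LETTER, gap]
        alternatives.append("(" + "".join(parts) + ")")
    return "|".join(alternatives)

PATTERN = re.compile(build_pattern())

def matches_alternation(s: str) -> bool:
    """Hypothetical helper: whole-string match against the alternation."""
    return PATTERN.fullmatch(s) is not None

for sample in ["abcde", "1abcd", "ab1cd", "abcd", "abc12"]:
    print(sample, matches_alternation(sample))
```

Generating the pattern avoids copy-and-paste slips between the alternatives, which are easy to make in a string this repetitive.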
If you can use regular grammars, this can be written much more compactly:
S → %a A | . S'
S' → %a A' | . S'
A → %a B | . A'
A' → %a B' | . A'
B → %a C | . B'
B' → %a C' | . B'
C → %a D | . C'
C' → %a D' | . C'
D → . D'
D' → . D' | ε
where S is the start symbol, . stands for any character and %a for any letter. Five states are needed to keep track of how many letters have been read; each state X also needs a state X' to keep track of whether an additional character (one not counted among the four letters) has been read yet.
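The state-tracking idea behind the grammar can be sketched as a small deterministic scanner. This is not a direct transcription of the productions: it is a hand-determinized version, written in Python as an assumption, with %a taken to mean a lowercase ASCII letter. The counter plays the role of the states S, A, B, C, D and the boolean plays the role of the primed states.

```python
import string

LETTERS = set(string.ascii_lowercase)  # assumption: %a means a-z

def accepts(s: str) -> bool:
    """Hypothetical scanner: at least 5 characters, at least 4 of them letters."""
    letters = 0    # letters counted so far, capped at 4 (states S, A, B, C, D)
    extra = False  # True once a character beyond the four counted letters was read
    for ch in s:
        if ch in LETTERS and letters < 4:
            letters += 1   # move to the next letter-counting state
        else:
            extra = True   # switch to (or stay in) the primed variant
    # Accept only in the state "four letters seen, plus at least one more character".
    return letters == 4 and extra

for sample in ["abcde", "1abcd", "abcd1!", "abcd", "abc12"]:
    print(sample, accepts(sample))
```

The scanner makes a single pass and needs no backtracking, which is the practical appeal of the automaton view.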