I'm trying to make sure that at least 4 alphanumeric characters are included in the input, and that underscores are also allowed.
The regular-expressions tutorial is a bit over my head because it talks about assertions and success/failure if there is a match.
^\w*(?=[a-zA-Z0-9]{4})$
my understanding:
\w
--> alphanumeric underscore
*
--> matches the previous token between zero and unlimited times (so this means it can be any number of alphanumeric/underscore characters, correct?)
(?=[a-zA-Z0-9]{4})
--> looks ahead of the previous characters, and if they include at least 4 alphanumeric characters, then I'm good.
Obviously I'm wrong on this, because regex101 is showing me no matches.
CodePudding user response:
You want 4 or more alphanumeric characters, each surrounded by any number of underscores (use ^ and $ to ensure it matches the whole input):
^(_*[a-zA-Z0-9]_*){4,}$
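A quick way to check this pattern is to run it against a few inputs. This is only a minimal sketch using Python's re module; the helper name is_valid and the sample strings are made up for illustration and are not part of the question.

```python
import re

# Pattern from the answer above: 4 or more alphanumerics, each optionally
# surrounded by underscores, anchored to the whole input.
PATTERN = re.compile(r"^(_*[a-zA-Z0-9]_*){4,}$")

def is_valid(text: str) -> bool:
    """Hypothetical helper: True if text holds at least 4 ASCII
    alphanumerics and otherwise only underscores."""
    return PATTERN.match(text) is not None

# Illustrative samples: the first three should pass, the last three should not.
for sample in ["abcd", "a_b_c_d", "__ab12__", "abc", "____", "ab-cd"]:
    print(sample, is_valid(sample))
```

Note that the underscores between two alphanumerics can be consumed by either neighbouring group, so a long input that ultimately fails may cause a lot of backtracking.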
CodePudding user response:
I suggest using atomic groups, (?>...); please see a regex tutorial for details:
^(?>_*[a-zA-Z0-9]_*){4,}$
This ensures 4 or more fragments, each of them containing a letter or digit.
Edit: If your regex engine doesn't support atomic groups, use plain non-capturing groups instead:
^(?:_*[A-Za-z0-9]_*){4,}$
CodePudding user response:
Your pattern ^\w*(?=[a-zA-Z0-9]{4})$ does not match because:
- ^\w* matches optional word characters from the start of the string. If the string contains only word characters, it matches all the way to the end of the string.
- (?=[a-zA-Z0-9]{4}) is a positive lookahead. It is true only if it can assert 4 consecutive alphanumeric characters directly to the right of the current position. Because \w* allows backtracking, it can give back 4 positions so that the assertion is true.
- But $ asserts the end of the string, which cannot match, because the position has moved 4 steps to the left to fulfil the preceding positive lookahead.
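The backtracking described above can be observed directly. The following is a small sketch in Python (the sample string "abcd1234" is an arbitrary choice): it runs the pattern once without the trailing $ to show where the match stops, and once with it.

```python
import re

text = "abcd1234"  # arbitrary sample input, only word characters

# Without the trailing $: \w* first consumes everything, then backtracks
# until 4 alphanumerics lie directly to the right of the current position.
partial = re.search(r"^\w*(?=[a-zA-Z0-9]{4})", text)
print(partial.group(), partial.end())  # abcd 4

# With the trailing $: the only positions where the lookahead succeeds are
# at least 4 characters before the end, so $ can never match there.
full = re.search(r"^\w*(?=[a-zA-Z0-9]{4})$", text)
print(full)  # None
```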
Using the lookahead, what you can do is assert 4 alphanumeric chars preceded by optional underscores.
If the assertion is true, match 1 or more word characters.
^(?=(?:_*[a-zA-Z0-9]){4})\w+$
The pattern matches:
- ^ Start of string
- (?= Positive lookahead, assert that what is to the right is
  - (?:_*[a-zA-Z0-9]){4} Repeat 4 times: optional underscores followed by an alphanumeric character
- ) Close the lookahead
- \w+ Match 1 or more word characters (which includes _)
- $ End of string
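For reference, here is a rough sketch of the lookahead version (with \w+ for "1 or more word characters") in Python. The function name has_four_alnum and the test strings are assumptions for illustration. The re.ASCII flag is also an addition: in Python 3, \w otherwise matches Unicode letters as well, which is probably not intended here.

```python
import re

# Lookahead asserts 4 alphanumerics (each optionally preceded by
# underscores); \w+ then consumes the whole input.
# re.ASCII is an assumption: it restricts \w to [a-zA-Z0-9_].
LOOKAHEAD = re.compile(r"^(?=(?:_*[a-zA-Z0-9]){4})\w+$", re.ASCII)

def has_four_alnum(text: str) -> bool:
    """Hypothetical helper: True if text is made of word characters
    and contains at least 4 alphanumerics."""
    return LOOKAHEAD.match(text) is not None

# Illustrative samples: the first two should pass, the rest should not.
for sample in ["abcd", "_a_b_c_d_", "a_b_c", "____", "ab cd", "abcd!"]:
    print(repr(sample), has_four_alnum(sample))
```

Because the lookahead only counts and \w+ does the actual matching, this form avoids the ambiguity of letting two adjacent groups compete for the same underscores.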