Using lookahead, how to ensure at least 4 alphanumeric chars are included, with underscores allowed

Time:09-30

I'm trying to make sure that at least 4 alphanumeric characters are included in the input, and that underscores are also allowed.

The regular-expressions tutorial is a bit over my head because it talks about assertions and success/failure if there is a match.

^\w*(?=[a-zA-Z0-9]{4})$

my understanding:

\w --> alphanumeric underscore

* --> matches the previous token between zero and unlimited times ( so, this means it can be any character that is alphanumeric/underscore, correct?)

(?=[a-zA-Z0-9]{4}) --> looks ahead of the previous characters, and if they include at least 4 alphanumeric characters, then I'm good.

Obviously I'm wrong on this, because regex101 is showing me no matches.

CodePudding user response:

You want 4 or more alphanumeric characters, surrounded by any number of underscores (use ^ and $ to ensure it matches the whole input):

^(_*[a-zA-Z0-9]_*){4,}$
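
As a quick check of the pattern above, here is a minimal Python sketch. The helper name has_four_alnum and the sample inputs are illustrative assumptions, not part of the answer.

```python
import re

# Pattern from the answer: each repetition consumes exactly one
# alphanumeric character, optionally wrapped in underscores.
PATTERN = re.compile(r"^(_*[a-zA-Z0-9]_*){4,}$")

def has_four_alnum(text):
    """Return True if text is only [A-Za-z0-9_] with at least 4 alphanumerics."""
    return PATTERN.match(text) is not None

for sample in ["abcd", "a_b_c_d", "__ab__cd__", "abc", "____", "abcd!"]:
    print(repr(sample), has_four_alnum(sample))
```

Note that in Python, $ also matches just before a trailing newline; use \Z or re.fullmatch if that matters for your input.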

CodePudding user response:

I suggest using atomic groups, (?>...); see a regex tutorial for details:

 ^(?>_*[a-zA-Z0-9]_*){4,}$

This ensures there are 4 or more fragments, each containing exactly one letter or digit.

Edit: If your regex engine doesn't support atomic groups, use a plain non-capturing group instead:

  ^(?:_*[A-Za-z0-9]_*){4,}$
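
A rough sketch of both variants in Python, for comparison. It assumes Python's re module, which accepts (?>...) only from version 3.11 on, so the code falls back to the non-capturing form on older interpreters. The names PLAIN, ATOMIC and is_valid are illustrative.

```python
import re

# Non-capturing variant from the answer; works on any Python 3 version.
PLAIN = re.compile(r"^(?:_*[A-Za-z0-9]_*){4,}$")

# Atomic variant: once a fragment has matched, the engine will not
# backtrack into it. Older Python versions reject (?>...) with re.error.
try:
    ATOMIC = re.compile(r"^(?>_*[A-Za-z0-9]_*){4,}$")
except re.error:
    ATOMIC = PLAIN

def is_valid(text, pattern=ATOMIC):
    """Return True if the whole string matches the given pattern."""
    return pattern.match(text) is not None

for sample in ["ab_cd", "__a1_b2__", "a_b_c", "abcd-"]:
    print(repr(sample), is_valid(sample), is_valid(sample, PLAIN))
```

Both variants accept the same strings; the atomic group only limits how much backtracking the engine can do on input that fails to match.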

CodePudding user response:

Your pattern ^\w*(?=[a-zA-Z0-9]{4})$ does not match because:

  • ^\w* Matches optional word characters from the start of the string; if the string contains only word characters, it matches all the way to the end of the string.
  • (?=[a-zA-Z0-9]{4}) The positive lookahead succeeds if it can assert 4 consecutive alphanumeric chars to the right of the current position. The \w* allows backtracking, and can backtrack 4 positions so that the assertion is true.
  • But the $ asserts the end of the string, which cannot match because the position has moved 4 steps to the left to fulfill the preceding lookahead assertion.
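
The backtracking described above can be observed in Python by removing the $ and checking where the match ends. This is only a sketch; the sample string "abcdefgh" and the variable names are assumptions for illustration.

```python
import re

ORIGINAL = r"^\w*(?=[a-zA-Z0-9]{4})$"
NO_END_ANCHOR = r"^\w*(?=[a-zA-Z0-9]{4})"

text = "abcdefgh"

# With $, the lookahead needs 4 characters to the right while $ needs
# none, so no position can satisfy both and the search fails.
full = re.search(ORIGINAL, text)
print(full)

# Without $, \w* first consumes everything, then gives back characters
# until 4 alphanumerics are visible ahead of the current position.
partial = re.search(NO_END_ANCHOR, text)
print(partial.group(), partial.end())
```

The second match stops 4 characters before the end of the string, which is exactly the position where $ cannot succeed.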

Using the lookahead, what you can do is assert 4 alphanumeric chars preceded by optional underscores.

If the assertion is true, match 1 or more word characters.

^(?=(?:_*[a-zA-Z0-9]){4})\w+$

The pattern matches:

  • ^ Start of string
  • (?= Positive lookahead, assert that what is to the right is
    • (?:_*[a-zA-Z0-9]){4} Repeat 4 times matching optional _ followed by an alphanumeric char
  • ) Close the lookahead
  • \w+ Match 1 or more word characters (which includes the _)
  • $ End of string
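
A minimal Python sketch of the lookahead approach, assuming the \w is followed by a + quantifier, as the "1 or more word characters" wording indicates. The function name check and the sample inputs are illustrative.

```python
import re

# The lookahead verifies 4 alphanumerics (each optionally preceded by
# underscores) without consuming anything; \w+ then matches the input.
LOOKAHEAD = re.compile(r"^(?=(?:_*[a-zA-Z0-9]){4})\w+$")

def check(text):
    """Return True if text is all word characters with at least 4 alphanumerics."""
    return LOOKAHEAD.match(text) is not None

for sample in ["abcd", "a_b_c_d", "__ab__cd__", "abc_", "____", "ab-cd"]:
    print(repr(sample), check(sample))
```

One difference from the character-class answers: in Python 3, \w on a str pattern also matches non-ASCII letters and digits unless re.ASCII is passed, so replace \w+ with [a-zA-Z0-9_]+ if only ASCII should be accepted.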
