Regex does not check the first character after first check

Time:10-25

I am trying to write a regex that:

  1. Allows only numbers, lowercase letters, "-" and "_".
  2. Can only start with a letter, a number, or "uuid:".
  3. Contains at least one letter.
  4. Consists of at least 2 characters.

I came up with this regex: \A(?:uuid:|[a-z0-9])(?=(.*[a-z])){1,}(?:\w|-)+\z However, I don't understand why the first character is not taken into account when it is a letter, so that a1, for example, does not pass. I also don't understand why it allows uppercase letters, as in AA.

Tests: https://rubular.com/r/Q5gEP15iaYkHYQ

Thank you in advance for your help

CodePudding user response:

You can also get the matches without lookarounds, using an alternation that matches at least 2 characters from the start of the string.

If you don't want to match uppercase chars A-Z, omit the /i flag, which makes matching case insensitive.

\A(?:uuid:|[a-z][a-z0-9_-]|[0-9][0-9_-]*[a-z])[a-z0-9_-]*\z

Explanation

  • \A Start of string
  • (?: Non capture group
    • uuid: Match literally
    • | Or
    • [a-z][a-z0-9_-] Match one char a-z followed by one of a-z 0-9 _ -
    • | Or
    • [0-9][0-9_-]*[a-z] Match a digit, optional chars 0-9 _ - and then a char a-z
  • ) Close non capture group
  • [a-z0-9_-]* Match optional chars a-z 0-9 _ -
  • \z End of string

Regex rubular demo

CodePudding user response:

AA is accepted because of the i flag at the end of your regex literal, which makes pattern matching case insensitive. With that flag, [a-z] also matches uppercase letters, so AA meets all your requirements: it starts with a letter, contains a letter, and has at least two chars. If you do not want to allow any uppercase letters, just remove the flag.
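
The effect of the flag can be seen in isolation with a short Ruby sketch. The pattern here is a simplified character-class check chosen for illustration (an assumption, not the full pattern from the question), and the constant names are hypothetical.

```ruby
# Simplified character-class check (an assumption for illustration,
# not the full pattern from the question).
CASE_SENSITIVE   = /\A[a-z0-9_-]+\z/
CASE_INSENSITIVE = /\A[a-z0-9_-]+\z/i

# With /i, [a-z] also matches uppercase letters.
puts "with /i,    AA: #{CASE_INSENSITIVE.match?('AA')}"
# Without /i, uppercase letters are rejected.
puts "without /i, AA: #{CASE_SENSITIVE.match?('AA')}"
# Lowercase input passes either way.
puts "without /i, aa: #{CASE_SENSITIVE.match?('aa')}"
```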

To fix the remaining issues, you can use

/\A(?=[^a-z]*[a-z])(?=.{2})(?:uuid:|[a-z0-9])[a-z0-9_-]*\z/

See this Rubular demo.

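Note that in the demo, (?=[^a-z\n]*[a-z]) is used rather than (?=[^a-z]*[a-z]) because the test is performed on a single multiline string, not an array of separate strings: without \n in the negated class, [^a-z]* could cross a line break and find a letter on a later line.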
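
The difference can be reproduced with a small Ruby sketch. It assumes the demo checks each line of one string with the line anchors ^ and $ instead of \A and \z; that setup, the sample text, and the constant names are assumptions made for illustration.

```ruby
# One multiline test string, as in a Rubular test box. Using the line anchors
# ^ and $ here is an assumption about how the demo was set up.
text = "12\nab"

# Without \n in the negated class, the lookahead on the line "12" can skip
# past the line break and find the letter "a" on the next line.
LOOSE_LOOKAHEAD  = /^(?=[^a-z]*[a-z])(?=.{2})(?:uuid:|[a-z0-9])[a-z0-9_-]*$/
# With \n excluded, the lookahead has to find a letter on the current line.
STRICT_LOOKAHEAD = /^(?=[^a-z\n]*[a-z])(?=.{2})(?:uuid:|[a-z0-9])[a-z0-9_-]*$/

loose_matches  = text.scan(LOOSE_LOOKAHEAD)
strict_matches = text.scan(STRICT_LOOKAHEAD)

puts "loose:  #{loose_matches.inspect}"
puts "strict: #{strict_matches.inspect}"
```

Here the loose version also reports 12, which contains no letter, while the strict version reports only ab.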

Details:

  • \A - start of string
  • (?=[^a-z]*[a-z]) - at least one letter
  • (?=.{2}) - at least two chars
  • (?:uuid:|[a-z0-9]) - uuid:, or one letter or digit
  • [a-z0-9_-]* - zero or more letters, digits, _ or -
  • \z - end of string
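
The lookahead-based pattern can be checked the same way in Ruby. Again this is a sketch under stated assumptions: LOOKAHEAD_PATTERN, lookahead_samples, and the sample strings are hypothetical names and inputs, not part of the original demo.

```ruby
# Lookahead-based pattern from this answer, without the /i flag.
LOOKAHEAD_PATTERN = /\A(?=[^a-z]*[a-z])(?=.{2})(?:uuid:|[a-z0-9])[a-z0-9_-]*\z/

# Hypothetical sample inputs chosen to exercise each rule.
lookahead_samples = %w[a1 1a uuid:abc 12 a AA _a]

# Keep only the samples that satisfy all four requirements.
passing = lookahead_samples.select { |s| LOOKAHEAD_PATTERN.match?(s) }
failing = lookahead_samples - passing

puts "passing: #{passing.inspect}"
puts "failing: #{failing.inspect}"
```

Because both lookaheads are anchored at \A, they inspect the whole string before the first character is consumed, which is why a1 passes here.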