Is there any difference between using regex in Java and regex in JavaScript?

Time:11-17

I need to build a regex pattern to validate a String in Java, so I wrote the pattern [A-Z][a-z]*\s?[A-Z]?[a-z]*$ for these conditions:

  • Must start with a capital letter
  • Every subsequent word must start with a capital letter
  • No digits allowed
  • No two consecutive spaces allowed

Pattern.matches("[A-Z][a-z]*\s?[A-Z]?[a-z]*$","Joe V") returns false for me in Java, but the same pattern returns true for the input "Joe V" on regexr.com.

What might be the cause?

CodePudding user response:

JavaScript has regex literals (/.../), while Java doesn't: in Java a pattern is written as an ordinary string literal. Java string literals use \ to introduce escape sequences (like \n), so to hand a literal \ to the regex engine you have to escape it with another \. Any \ in your regex should therefore be written as \\ in Java source.

So your code should be:

Pattern.matches("[A-Z][a-z]*\\s?[A-Z]?[a-z]*$", "Joe V")

which returns true

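P.S. Since Java 15, a single \s in a Java string literal is an escape sequence for a plain space character, not the regex whitespace class. Older Java versions reject it as an illegal escape.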
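
To make the escaping concrete, here is a minimal sketch. The class and method names are only for illustration and are not from the question. It prints the pattern as the regex engine receives it and checks a couple of inputs:

```java
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original question.
class RegexEscapeDemo {
    // "\\s" in Java source is the two characters \ and s,
    // which the regex engine reads as the whitespace class.
    static final String NAME_REGEX = "[A-Z][a-z]*\\s?[A-Z]?[a-z]*$";

    static boolean isValid(String input) {
        // Pattern.matches requires the whole input to match.
        return Pattern.matches(NAME_REGEX, input);
    }

    public static void main(String[] args) {
        System.out.println(NAME_REGEX);        // prints [A-Z][a-z]*\s?[A-Z]?[a-z]*$
        System.out.println(isValid("Joe V"));  // prints true
        System.out.println(isValid("Joe  V")); // prints false (two consecutive spaces)
    }
}
```

Note that this pattern allows at most two words, so something like "Joe V Smith" would still be rejected.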

CodePudding user response:

You can use

Pattern.matches("[A-Z][a-z]*(?:\\s[A-Z][a-z]*)*","Joe V")
Pattern.matches("\\p{Lu}\\p{Ll}*(?:\\s\\p{Lu}\\p{Ll}*)*","Joe V")

See the regex demo #1 and regex demo #2.

Note that .matches requires a full string match, hence the use of ^ and $ anchors on the testing site and their absence in the code.
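
The difference between a full match and a partial match can be sketched like this. The class and method names are made up for the example; Matcher.matches() behaves like Pattern.matches, while Matcher.find() behaves like an unanchored pattern on a testing site:

```java
import java.util.regex.Pattern;

// Hypothetical demo class, only for illustration.
class FullMatchDemo {
    // No ^ or $ here: matches() already requires the whole input to match.
    static final Pattern WORDS =
            Pattern.compile("[A-Z][a-z]*(?:\\s[A-Z][a-z]*)*");

    // Whole input must match, same as Pattern.matches(regex, input).
    static boolean fullMatch(String s) {
        return WORDS.matcher(s).matches();
    }

    // Succeeds if any substring matches, like an unanchored search.
    static boolean partialMatch(String s) {
        return WORDS.matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(fullMatch("Joe V"));     // prints true
        System.out.println(fullMatch("Joe V1"));    // prints false
        System.out.println(partialMatch("Joe V1")); // prints true ("Joe V" is found inside)
    }
}
```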

Details:

  • ^ - start of string (implied in .matches)
  • [A-Z] / \p{Lu} - an ASCII uppercase letter / any Unicode uppercase letter
  • [a-z]* / \p{Ll}* - zero or more ASCII lowercase letters / zero or more Unicode lowercase letters
  • (?:\s[A-Z][a-z]*)* / (?:\s\p{Lu}\p{Ll}*)* - zero or more sequences of
    • \s - one whitespace
    • [A-Z][a-z]* / \p{Lu}\p{Ll}* - an uppercase letter and then zero or more lowercase letters (ASCII / Unicode, respectively).
  • $ - end of string (implied in .matches)
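
As a rough illustration of how the ASCII and Unicode variants differ, the sketch below runs both patterns against a name with an accented capital. The class name, method names and sample input are assumptions made for the example; the accented letter is written as a Unicode escape to avoid source-encoding issues:

```java
import java.util.regex.Pattern;

// Hypothetical demo class, only for illustration.
class UnicodeNameDemo {
    static final String ASCII_REGEX = "[A-Z][a-z]*(?:\\s[A-Z][a-z]*)*";
    static final String UNICODE_REGEX = "\\p{Lu}\\p{Ll}*(?:\\s\\p{Lu}\\p{Ll}*)*";

    static boolean asciiMatch(String s) {
        return Pattern.matches(ASCII_REGEX, s);
    }

    static boolean unicodeMatch(String s) {
        return Pattern.matches(UNICODE_REGEX, s);
    }

    public static void main(String[] args) {
        String name = "\u00C9lodie Durand"; // starts with an uppercase E with acute accent
        System.out.println(asciiMatch(name));      // prints false: the first letter is outside A-Z
        System.out.println(unicodeMatch(name));    // prints true: \p{Lu} covers it
        System.out.println(unicodeMatch("Joe V")); // prints true
    }
}
```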