I have a requirement to build a regex pattern to validate a String in Java. So I built the pattern
[A-Z][a-z]*\s?[A-Z]?[a-z]*$
for the conditions:
- Should start with a capital letter
- Every subsequent word should start with a capital letter
- No numbers included
- No two consecutive spaces allowed
Pattern.matches("[A-Z][a-z]*\s?[A-Z]?[a-z]*$","Joe V")
returns false for me in Java.
But the same pattern returns true for the data "Joe V" in regexr.com.
What might be the cause?
CodePudding user response:
JavaScript has regex literals, while Java does not: in Java a regex is written as an ordinary string literal. Since Java uses \ to introduce escape sequences in strings (like \n), you have to escape the \ with another \ for it to reach the regex engine as a literal \ character. So any \ that belongs to the regex should be written as \\ in Java source.
Thus your regex / code should be:
Pattern.matches("[A-Z][a-z]*\\s?[A-Z]?[a-z]*$", "Joe V")
which returns true
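P.S. Written with a single backslash, \s in a Java string literal is the escape for a plain space character (Java 15 and later; earlier versions reject it as an invalid escape sequence), not the regex whitespace class.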
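A minimal sketch of the escaping rule described in this answer. The class name EscapeDemo and its members are hypothetical, chosen only for illustration; the only library call used is java.util.regex.Pattern.matches. It shows that the Java literal "\\s" holds two characters (a backslash and an s), which is what the regex engine needs to see.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the names here are assumptions for illustration.
class EscapeDemo {
    // The Java literal "\\s" is two characters: a backslash followed by 's'.
    static final String WHITESPACE_CLASS = "\\s";

    // The pattern from the question, with the backslash doubled for Java.
    static final String NAME_PATTERN = "[A-Z][a-z]*\\s?[A-Z]?[a-z]*$";

    // Pattern.matches requires the whole input to match the pattern.
    static boolean matchesName(String input) {
        return Pattern.matches(NAME_PATTERN, input);
    }

    public static void main(String[] args) {
        System.out.println(WHITESPACE_CLASS.length()); // 2
        System.out.println(matchesName("Joe V"));      // true
    }
}
```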
CodePudding user response:
You can use
Pattern.matches("[A-Z][a-z]*(?:\\s[A-Z][a-z]*)*","Joe V")
Pattern.matches("\\p{Lu}\\p{Ll}*(?:\\s\\p{Lu}\\p{Ll}*)*","Joe V")
See the regex demo #1 and regex demo #2.
Note that .matches requires a full string match, hence the use of ^ and $ anchors on the testing site and their absence in the code.
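To illustrate the full-match behavior, here is a small sketch comparing Matcher.matches() with Matcher.find() on the same unanchored pattern. The class and method names (FullMatchDemo, fullMatch, partialMatch) are hypothetical and not part of the original answer.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; names are assumptions for illustration.
class FullMatchDemo {
    // One capitalized word, with no ^ or $ anchors.
    static final Pattern WORD = Pattern.compile("[A-Z][a-z]*");

    // matches(): the entire input must be consumed by the pattern.
    static boolean fullMatch(String input) {
        return WORD.matcher(input).matches();
    }

    // find(): it is enough for some substring of the input to match.
    static boolean partialMatch(String input) {
        return WORD.matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(fullMatch("Joe"));      // true
        System.out.println(fullMatch("Joe 5"));    // false
        System.out.println(partialMatch("Joe 5")); // true
    }
}
```

With find(), the ^ and $ anchors would have to be written explicitly to get the same result as matches().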
Details:
- ^ - start of string (implied in .matches)
- [A-Z] / \p{Lu} - an (Unicode) uppercase letter
- [a-z]* / \p{Ll}* - zero or more (Unicode) lowercase letters
- (?:\s[A-Z][a-z]*)* / (?:\s\p{Lu}\p{Ll}*)* - zero or more sequences of
  - \s - one whitespace
  - [A-Z][a-z]* / \p{Lu}\p{Ll}* - an uppercase (Unicode) letter and then zero or more (Unicode) lowercase letters.
- $ - end of string (implied in .matches)
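As a rough check of the two patterns against the conditions in the question, the sketch below runs them on a few sample inputs. The class name NameValidator, its method names, and the sample strings are assumptions made for illustration only. The accented letter is written as a Unicode escape so the source does not depend on file encoding.

```java
import java.util.regex.Pattern;

// Hypothetical validator; names and sample inputs are assumptions.
class NameValidator {
    // ASCII-only variant from the answer.
    static final Pattern ASCII_NAME =
            Pattern.compile("[A-Z][a-z]*(?:\\s[A-Z][a-z]*)*");

    // Unicode-aware variant from the answer.
    static final Pattern UNICODE_NAME =
            Pattern.compile("\\p{Lu}\\p{Ll}*(?:\\s\\p{Lu}\\p{Ll}*)*");

    static boolean isAsciiName(String s) {
        return ASCII_NAME.matcher(s).matches();
    }

    static boolean isUnicodeName(String s) {
        return UNICODE_NAME.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isAsciiName("Joe V"));        // true
        System.out.println(isAsciiName("Joe Van Dyke")); // true
        System.out.println(isAsciiName("Joe  V"));       // false: two spaces
        System.out.println(isAsciiName("Joe v"));        // false: lowercase word
        System.out.println(isAsciiName("Joe5"));         // false: digit
        // U+00C9 is an uppercase E with acute accent.
        System.out.println(isAsciiName("\u00C9mile Zola"));   // false
        System.out.println(isUnicodeName("\u00C9mile Zola")); // true
    }
}
```

Note that \s also matches tabs and line breaks, so if only a plain space should separate words, a literal space could be used in place of \\s.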