java regex tell which column not match

Time:01-18

Good day,

My Java code is as follows:

Pattern p = Pattern.compile("^[a-zA-Z0-9$& ,:;=\\[\\]{}?@#|\\\\'<>._^*()%!/~\"`  -]*$");
String i = "f698fec0-dd89-11e8-b06b-☺";
Matcher tagmatch = p.matcher(i);
System.out.println("tagmatch is " + tagmatch.find());

As expected, the result is false, because the string contains the ☺ character. However, I would like to report the column number of the character that does not match. For this example, it should report column 25 as the position of the invalid character.

How can I do this?

CodePudding user response:

You should remove the anchors from your regex and then use the Matcher#end() method to get the position where the match stopped, like this:

String i = "f698fec0-dd89-11e8-b06b-☺";
Pattern p = Pattern.compile("[\\w$& ,:;=\\[\\]{}?@#|\\\\'<>.^*()%!/~\"`  -]+");
Matcher m = p.matcher(i);
if (m.lookingAt() && i.length() > m.end()) {
   System.out.println("Match <" + m.group() + "> failed at: " + m.end());
}

Output:

Match <f698fec0-dd89-11e8-b06b-> failed at: 24
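
The question asks for a one-based column (25), while end() returns a zero-based index (24). The following is a minimal sketch of a helper that does the conversion, not a definitive implementation. The class and method names are hypothetical, and the character class is copied as-is from the snippet above. It assumes you also want a result when the very first character is invalid, so it uses * rather than the one-or-more quantifier: the match may then be empty, lookingAt() always succeeds, and end() is the index of the first disallowed character. Columns are counted in UTF-16 code units.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class InvalidColumnFinder {
    // Character class copied from the answer; '*' allows an empty match.
    private static final Pattern ALLOWED =
            Pattern.compile("[\\w$& ,:;=\\[\\]{}?@#|\\\\'<>.^*()%!/~\"`  -]*");

    // Returns the 1-based column of the first disallowed character,
    // or -1 if every character in the input is allowed.
    static int firstInvalidColumn(String input) {
        Matcher m = ALLOWED.matcher(input);
        m.lookingAt(); // always succeeds because the pattern can match nothing
        return m.end() == input.length() ? -1 : m.end() + 1;
    }

    public static void main(String[] args) {
        // "\u263A" is the ☺ character from the question.
        System.out.println(firstInvalidColumn("f698fec0-dd89-11e8-b06b-\u263A")); // 25
        System.out.println(firstInvalidColumn("\u263Aabc"));                      // 1
        System.out.println(firstInvalidColumn("abc"));                            // -1
    }
}
```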

PS: end() returns a zero-based index, so 24 here corresponds to column 25 when counting from 1. I used lookingAt() to ensure the pattern is matched starting from the beginning of the region. You can also use find() to get the next match anywhere in the input, or keep the start anchor in the pattern, as in

"^[\\w$& ,:;=\\[\\]{}?@#|\\\\'<>.^*()%!/~\"`  -]+"

and use find(), which makes it behave like the lookingAt() code above.

See also: the difference between lookingAt() and find().
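
As a rough illustration of that difference: lookingAt() only succeeds if a match begins at the start of the input, while find() scans forward for a match anywhere. The sketch below uses a deliberately simplified pattern ([a-z]+) and hypothetical class and method names; it is not taken from the original post.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LookingAtVsFind {
    // Simplified, hypothetical pattern used only for this illustration.
    private static final Pattern LETTERS = Pattern.compile("[a-z]+");

    // True only if a match begins at index 0 of the input.
    static boolean matchesAtStart(String s) {
        return LETTERS.matcher(s).lookingAt();
    }

    // Start index of the first match anywhere in the input, or -1 if none.
    static int firstMatchStart(String s) {
        Matcher m = LETTERS.matcher(s);
        return m.find() ? m.start() : -1;
    }

    public static void main(String[] args) {
        String s = "\u263Aabc"; // "☺abc": the invalid character comes first
        System.out.println(matchesAtStart(s));  // false
        System.out.println(firstMatchStart(s)); // 1
    }
}
```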

I have refactored your regex to use \w instead of [a-zA-Z0-9_] and used the + quantifier (meaning match 1 or more) instead of * (meaning match 0 or more) to avoid returning success for zero-length matches.
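
To make the zero-length point concrete, here is a small sketch with a simplified pattern and hypothetical names. With *, lookingAt() reports success even when the first character is not allowed, because the empty string matches; with the one-or-more quantifier it reports failure. It also shows that, with Java's default flags, \w covers only ASCII letters, digits and the underscore.

```java
import java.util.regex.Pattern;

class ZeroLengthMatchDemo {
    // Runs lookingAt() for the given regex against the given input.
    static boolean looksAt(String regex, String input) {
        return Pattern.compile(regex).matcher(input).lookingAt();
    }

    public static void main(String[] args) {
        String smiley = "\u263A"; // "☺"
        System.out.println(looksAt("[a-z]*", smiley)); // true: empty match at index 0
        System.out.println(looksAt("[a-z]+", smiley)); // false: needs at least one char

        System.out.println("_".matches("\\w"));        // true
        System.out.println("\u00E9".matches("\\w"));   // false: "é" is not ASCII
    }
}
```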
