I'm working to familiarize myself with regex in Java, and it's a bit of a learning curve for me.
So the solution to the problem is as follows:
public static String validate(String ip) {
    // Strip leading zeros from each dot-separated number, keeping each number's last digit
    return ip.replaceAll("(?<=^|\\.)0+(?!\\.|$)", "");
}
I just need clarification on why this regex solution works.
I get that the "?" specifies zero or one instances, the "^" represents the beginning of a string, the "\\." is an escaped period, the "$" represents the end of the line, and the "" at the end means the match is deleted, but I don't understand the regex as a whole. If someone could quickly walk me through what it all means together, that would be greatly appreciated. Thank you!
CodePudding user response:
In my opinion, this regex is a tricky one, because it uses two constructs that are not very common: lookahead and lookbehind.
I get that the "?" specifies zero or one instances
That is true, but only when it follows something to match (a character, a character class or a group).
Here it follows nothing: it comes right after an opening parenthesis, which means it introduces a special construct. There is a dedicated Javadoc subsection for these (see [1]).
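To see the two roles of "?" side by side, here is a small sketch. The class name, method names and sample strings are made up for illustration; only Pattern.matches and String.replaceAll come from the JDK.

```java
import java.util.regex.Pattern;

class QuestionMarkDemo {
    // "?" after an item is a quantifier: the "u" is optional (zero or one).
    static boolean optionalU(String s) {
        return Pattern.matches("colou?r", s);
    }

    // "(?=" right after "(" opens a lookahead: it checks that "px" follows,
    // but "px" is not part of the match, so it is not replaced.
    static String maskPixelValues(String s) {
        return s.replaceAll("\\d+(?=px)", "N");
    }

    public static void main(String[] args) {
        System.out.println(optionalU("color"));            // true
        System.out.println(optionalU("colour"));           // true
        System.out.println(maskPixelValues("10px 20em"));  // Npx 20em
    }
}
```

In the second method the digits in "20em" are left alone, because the lookahead fails there.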
In your example, we can find two different constructs:
- (?<=^|\\.)
  - This is cited in [1] as
    (?<=X)    X, via zero-width positive lookbehind
  - What is this "positive lookbehind" stuff? Source [2] defines it as
    Asserts that what immediately precedes the current position in the string is X
  - In your case, we ask to verify that the 0 is either (|) the first character of the input text (^), or is just after a dot (\\.).
- (?!\\.|$)
  - [1] defines it as
    (?!X)    X, via zero-width negative lookahead
  - [2] explains:
    Asserts that what immediately follows the current position in the string is not X
  - In your context, it ensures that we don't match the final zero of a number (a zero at the very end of the input, or a zero that is just before a dot), so a number that is only "0" is kept.
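Putting the two assertions together, here is a minimal sketch you can run to check the behaviour. It uses the pattern with 0+ (one or more zeros) between the lookbehind and the lookahead. The class name, the lookbehindOnly helper and the sample addresses are only there for illustration.

```java
class LeadingZeros {
    // Lookbehind + zeros + lookahead: removes leading zeros of each number,
    // but never the last digit of a number.
    static String validate(String ip) {
        return ip.replaceAll("(?<=^|\\.)0+(?!\\.|$)", "");
    }

    // Hypothetical variant without the lookahead, to show why it is needed:
    // a number made only of zeros disappears completely.
    static String lookbehindOnly(String ip) {
        return ip.replaceAll("(?<=^|\\.)0+", "");
    }

    public static void main(String[] args) {
        System.out.println(validate("192.168.001.010"));  // 192.168.1.10
        System.out.println(validate("000.001.010.100"));  // 0.1.10.100
        System.out.println(validate("10.0.0.1"));         // 10.0.0.1
        System.out.println(lookbehindOnly("10.0.0.1"));   // 10...1
    }
}
```

Note that the zero in "10" or "100" is never touched: it is preceded by a digit, so the lookbehind fails. For "000", the engine first tries all three zeros, the lookahead rejects that because a dot follows, and it backtracks to two zeros, which leaves a single "0".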