Java regex splitting, but only removing one whitespace

Time:12-01

I have this code:

String[] parts = sentence.split("\\s");

and a sentence like: "this is a whitespace   and I want to split it" (note there are 3 whitespaces after "whitespace")

I want to split it so that only the last whitespace character of each run is removed, keeping the rest of the original message intact. The output should be

"[this], [is], [a], [whitespace  ], [and], [I], [want], [to], [split], [it]" (two whitespaces after the word "whitespace")

Can I do this with regex and if not, is there even a way?

I removed the + from \\s+ so that only one whitespace is removed.

CodePudding user response:

You can use

String[] parts = sentence.split("\\s(?=\\S)");

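That will split at a whitespace char that is immediately followed by a non-whitespace char, so in a run of several whitespace chars only the last one is consumed and the others stay attached to the preceding word.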

Details:

  • \s - a whitespace char
  • (?=\S) - a positive lookahead that requires a non-whitespace char to appear immediately to the right of the current location.
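To see the pattern in action, here is a minimal sketch using the sentence from the question. The class name SplitKeepExtraWhitespace and the helper splitOnLastWhitespace are made up for illustration; only String.split and the regex itself come from the answer.

```java
import java.util.Arrays;

// Hypothetical demo class; the name is not from the original post.
class SplitKeepExtraWhitespace {

    // Splits at every whitespace char that is directly followed by a
    // non-whitespace char. The lookahead (?=\S) is zero-width, so only the
    // single whitespace char matched by \s is removed.
    static String[] splitOnLastWhitespace(String sentence) {
        return sentence.split("\\s(?=\\S)");
    }

    public static void main(String[] args) {
        // Three spaces after "whitespace", as described in the question.
        String sentence = "this is a whitespace   and I want to split it";
        String[] parts = splitOnLastWhitespace(sentence);

        // Prints: [this, is, a, whitespace  , and, I, want, to, split, it]
        System.out.println(Arrays.toString(parts));
    }
}
```

Note that trailing whitespace at the very end of the input is never followed by a non-whitespace char, so it stays attached to the last element.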

To make it fully Unicode-aware in Java, add the (?U) embedded flag (the inline equivalent of the Pattern.UNICODE_CHARACTER_CLASS option): .split("(?U)\\s(?=\\S)"). Without it, \s and \S only consider the ASCII whitespace chars.
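A small sketch of what the flag changes, assuming an input that contains U+2003 (EM SPACE), which is whitespace in Unicode but is not matched by the default \s. The class and method names are hypothetical and only serve the comparison.

```java
// Hypothetical demo class; the name is not from the original post.
class UnicodeAwareSplitDemo {

    // Default behaviour: \s matches only ASCII whitespace.
    static String[] splitAscii(String s) {
        return s.split("\\s(?=\\S)");
    }

    // With (?U), \s and \S follow the Unicode White_Space property.
    static String[] splitUnicode(String s) {
        return s.split("(?U)\\s(?=\\S)");
    }

    public static void main(String[] args) {
        // "em" and "space" are separated by U+2003 (EM SPACE);
        // the other separators are plain ASCII spaces.
        String sentence = "em\u2003space and more";

        System.out.println(splitAscii(sentence).length);   // prints 3
        System.out.println(splitUnicode(sentence).length); // prints 4
    }
}
```

If the input only ever contains ASCII spaces, tabs, and line breaks, both variants give the same result.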
