I need to split a text into sentences on the period (.), but the text also contains numbers such as 1.000. How can I define a split on (.) that ignores periods between digits?
Example:
"I paid 1.000 dollars. Very expensive. But I think today it should be cheaper."
I got this:
I paid 1.
000 dollars.
Very expensive.
But I think today it should be cheaper.
But I need this:
I paid 1.000 dollars.
Very expensive.
But I think today it should be cheaper.
CodePudding user response:
Using the regex from this answer, you can do the following:
public static String[] split(String str) {
    // Split on one or more '.' or '!' not followed by a digit, or on line breaks
    return str.split("[\\.\\!]+(?!\\d)\\s*|\\n+\\s*");
}
The result:
I paid 1.000 dollars
Very expensive
But I think today it should be cheaper
CodePudding user response:
Just use negative lookarounds:
String textToParse = "I paid 1.000 dollars. Very expensive. But I think today it should be cheaper.";
// Split on a period that has no digit directly before or after it
String[] chunks = textToParse.split("(?<!\\d)\\.(?!\\d)");
for (int i = 0; i < chunks.length; i++) {
    System.out.println(chunks[i].trim());
}
Explanation:
I used a negative lookahead, which asserts that what follows does not match the given pattern, so (?!\d) ensures that the period matches only if it is NOT followed by a digit (\d).
I also used a negative lookbehind, which works the same way but checks what precedes the position instead of what follows. So (?<!\d) ensures that the period is not preceded by a digit.
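Note that both splits above consume the period, while the question asks for it to be kept. The following is only a sketch of one possible variant, not part of either answer: it splits on the whitespace that follows a period, using a positive lookbehind, so the period stays attached to its sentence and "1.000" is left alone because no whitespace follows its period. The class name SentenceSplitter and both method names are made up for this illustration, and the variant assumes sentences are separated by whitespace.

```java
import java.util.Arrays;

// Hypothetical helper class, written only to illustrate the lookaround approach.
class SentenceSplitter {

    // The split from the answer above: a period with no digit directly
    // before or after it. The period is consumed, so chunks lose it.
    static String[] splitDroppingPeriods(String text) {
        return Arrays.stream(text.split("(?<!\\d)\\.(?!\\d)"))
                .map(String::trim)
                .toArray(String[]::new);
    }

    // Assumed variant: split on whitespace that is preceded by a period.
    // The lookbehind (?<=\.) consumes nothing, so each chunk keeps its period.
    static String[] splitKeepingPeriods(String text) {
        return text.trim().split("(?<=\\.)\\s+");
    }

    public static void main(String[] args) {
        String text = "I paid 1.000 dollars. Very expensive. "
                + "But I think today it should be cheaper.";
        for (String sentence : splitKeepingPeriods(text)) {
            System.out.println(sentence);
        }
    }
}
```

Because the variant keys on whitespace rather than on digits, it would also split after abbreviations such as "e.g. ", so it is a starting point rather than a general sentence tokenizer.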