The regex I have matches everything except the Chinese characters, but it also matches the numbers, which I don't want. As you can see in the regex demo here, the number 45 is matched, but I need it to be excluded as well.
https://regex101.com/r/XNtD12/1
Current regex is: (?!\p{IsHan}\n)[^\p{IsHan}\n?。,?!]
Desired output:
He is 45 today <- matched 100%
你今天45岁了 <- not matched at all
这个句子没有数字 <- not matched at all
Ok I see <- matched 100%
Java code being used:
String example = "He is 45 today\n你今天45岁了\n这个句子没有数字\nOk I see";
System.out.println(example.replaceAll("^[^\\p{IsHan}\\n?。,?!]+$", ""));
Answer:
In your pattern you can omit the lookahead (?!\p{IsHan}\n). It only rejects a Han character that is immediately followed by a newline, and the negated character class that directly follows it already does not match \p{IsHan}.
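To check that the lookahead adds nothing, here is a minimal sketch that collects the matches of both variants on the sample text. The class name LookaheadCheck and the helper matches are made up for illustration, and a + quantifier is assumed on the character class so that each run is reported as one match.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LookaheadCheck {
    // Sample text from the question.
    static final String EXAMPLE =
            "He is 45 today\n你今天45岁了\n这个句子没有数字\nOk I see";

    // Pattern with the lookahead (the + quantifier is an assumption).
    static final String WITH_LOOKAHEAD =
            "(?!\\p{IsHan}\\n)[^\\p{IsHan}\\n?。,?!]+";

    // Same pattern with the lookahead removed.
    static final String WITHOUT_LOOKAHEAD =
            "[^\\p{IsHan}\\n?。,?!]+";

    // Hypothetical helper: returns every match of regex in input, in order.
    static List<String> matches(String regex, String input) {
        List<String> found = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        // Both lines print the same list, including the unwanted "45".
        System.out.println(matches(WITH_LOOKAHEAD, EXAMPLE));
        System.out.println(matches(WITHOUT_LOOKAHEAD, EXAMPLE));
    }
}
```

Both variants return the same matches, and both still pick up the 45 inside the second line, because nothing ties a match to a whole line.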
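If you don't want partial matches, such as the 45 inside the second line, you can add anchors to the start and the end of the pattern and enable multiline mode using the inline modifier (?m), so that ^ and $ match at the start and end of every line: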
String example = "He is 45 today\n你今天45岁了\n这个句子没有数字\nOk I see";
System.out.println(example.replaceAll("(?m)^[^\\p{IsHan}\\n?。,?!]+$", ""));
If you also want to remove the optional trailing newline of each matched line using replaceAll, append \\R? to the pattern (still with multiline mode enabled):
^[^\\p{IsHan}\\n?。,?!]+$\\R?
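As a rough, self-contained sketch of the two replacements side by side: the class name RemoveNonHanLines and its method names are hypothetical, and the sample string is the one from the question.

```java
public class RemoveNonHanLines {
    // Whole lines that contain no Han characters and none of the listed
    // punctuation; (?m) makes ^ and $ match at every line boundary.
    private static final String LINE = "(?m)^[^\\p{IsHan}\\n?。,?!]+$";

    // Blanks out the matching lines but keeps their line breaks.
    static String stripLines(String input) {
        return input.replaceAll(LINE, "");
    }

    // Also removes the line break that follows each matching line, if any.
    static String stripLinesAndNewline(String input) {
        return input.replaceAll(LINE + "\\R?", "");
    }

    public static void main(String[] args) {
        String example =
                "He is 45 today\n你今天45岁了\n这个句子没有数字\nOk I see";
        System.out.println(stripLines(example));
        System.out.println("---");
        System.out.println(stripLinesAndNewline(example));
    }
}
```

With the first variant an empty line is left behind where "He is 45 today" used to be. The second variant removes that line break as well, so the result starts directly with the first Chinese line. The 45 inside that line is kept in both cases.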