I am looking for a regex that finds a specific line break (\n) in a long String.
The specific \n is the one before a line that does not contain a specific char: '#'
For example:
This tis a fine #line1\nThis tis another fine #line2\nThis_belongs_to abobe line\nThis tis still is OK #line4
that represents the text:
this tis a fine #line1
this tis another fine #line2
this_belongs_to abobe line
this tis still is OK #line4
Here the \n to be removed is the one after #line2, resulting in the text:
this tis a fine #line1
this tis another fine #line2this_belongs_to abobe line
this tis still is OK #line4
I came up with a regex like: \n^(?m)(?!.*#).*$
It is close, but I can't figure out how to build one that matches and removes only the right line break while preserving the rest of the String.
Perhaps there is a better way than using a regular expression?
CodePudding user response:
You can use either of the following:
text = text.replaceAll("\\R(?!.*#)", "");
text = text.replaceAll("(?m)\\R(?=[^\n#]+$)", "");
Details:
(?m) - Pattern.MULTILINE embedded flag option to make $ in this pattern match the end of a line, not the end of the whole string
\R - any line break sequence
(?!.*#) - a negative lookahead that matches a location not immediately followed by zero or more chars other than line break chars, as many as possible, and then a # char
(?=[^\n#]+$) - a positive lookahead that requires one or more chars other than an LF and # up to the end of a line (replace + with * to match an empty line, too)
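The difference between the + and * quantifiers in the second lookahead only shows up when the input contains an empty line. Here is a minimal sketch of that, assuming LF-only input; the class name QuantifierDemo, its two helper methods and the sample string are made up for illustration and are not part of the original answer:

```java
class QuantifierDemo {
    // Hypothetical helper: the lookahead requires at least one char on the next line.
    static String joinNonEmpty(String text) {
        return text.replaceAll("(?m)\\R(?=[^\n#]+$)", "");
    }

    // Hypothetical helper: same pattern with *, so an empty next line also qualifies.
    static String joinIncludingEmpty(String text) {
        return text.replaceAll("(?m)\\R(?=[^\n#]*$)", "");
    }

    public static void main(String[] args) {
        // Illustrative input: a '#' line, an empty line, a line without '#', a '#' line.
        String s = "a #1\n\nb\nc #2";
        // Line breaks are shown as a literal \n so each result fits on one line.
        System.out.println(joinNonEmpty(s).replace("\n", "\\n"));       // a #1\nb\nc #2
        System.out.println(joinIncludingEmpty(s).replace("\n", "\\n")); // a #1b\nc #2
    }
}
```

With +, the break in front of the empty line is kept, so "b" stays on its own line. With *, that break is removed as well and "b" is appended to the preceding # line.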
Java demo:
String s_lf = "this tis a fine #line1\nthis tis another fine #line2\nthis_belongs_to abobe line\nthis tis still is OK #line4";
String s_crlf = "this tis a fine #line1\r\nthis tis another fine #line2\r\nthis_belongs_to abobe line\r\nthis tis still is OK #line4";
System.out.println(s_lf.replaceAll("\\R(?!.*#)", ""));
System.out.println(s_crlf.replaceAll("\\R(?!.*#)", ""));
System.out.println(s_lf.replaceAll("(?m)\\R(?=[^\n#]+$)", ""));
System.out.println(s_crlf.replaceAll("(?m)\\R(?=[^\n#]+$)", ""));
All test cases, with strings having CRLF and LF line endings, print:
this tis a fine #line1
this tis another fine #line2this_belongs_to abobe line
this tis still is OK #line4
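As for whether there is a way that does not rely on lookaheads: one possible alternative is to split the text into lines and re-join them in a loop. This is only a sketch under a few assumptions. The class ContinuationJoiner and the method joinLinesWithoutMarker are hypothetical names, the output always uses \n regardless of the input line endings, and an empty line counts as a line without '#' (as with the first regex above), so it is joined to the previous line.

```java
class ContinuationJoiner {
    // Hypothetical helper, not part of the answer above.
    static String joinLinesWithoutMarker(String text, char marker) {
        // Split on any line break sequence; limit -1 keeps trailing empty strings.
        String[] lines = text.split("\\R", -1);
        StringBuilder out = new StringBuilder(lines[0]);
        for (int i = 1; i < lines.length; i++) {
            // Re-insert a break only in front of lines that contain the marker.
            if (lines[i].indexOf(marker) >= 0) {
                out.append('\n');
            }
            out.append(lines[i]);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String s_lf = "this tis a fine #line1\nthis tis another fine #line2\nthis_belongs_to abobe line\nthis tis still is OK #line4";
        System.out.println(joinLinesWithoutMarker(s_lf, '#'));
    }
}
```

The loop makes the rule explicit (a break survives only if the following line contains the marker), at the cost of normalizing every kept line break to \n.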