I can't find a solution to this very simple problem: I would like to replace a tab in a String with a whitespace (and only one space). For example, I have a String like this:
Hello World!
New line
And I would like to get this as a result:
Hello World!
New line
For this, I used:
myStr.replaceAll("\\s+", " ");
The tabs are removed as expected, but so is the line break:
Hello World! New line
I also tried replaceAll with "[\\t ]" as the pattern, but replacing it with a single space does not change anything.
I must be missing a simple solution, but I don't see it.
CodePudding user response:
You need the following pattern, which matches one or more contiguous tabs and spaces but leaves line breaks alone:
[\\t ]+
For regular expressions it's always a good idea to test them out using a tool like https://regexr.com/
There you can enter your sample and the regular expression and it even explains what's going on.
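As a minimal sketch of how that pattern behaves in Java (the class name TabCondenser and the method condense are hypothetical names chosen for this illustration, and the sample input is made up):

```java
import java.util.regex.Pattern;

final class TabCondenser {
    // One or more consecutive tabs or spaces; \r and \n are not in the class,
    // so line breaks are left untouched.
    private static final Pattern TABS_AND_SPACES = Pattern.compile("[\\t ]+");

    // Replaces every run of tabs/spaces with exactly one space.
    static String condense(String text) {
        return TABS_AND_SPACES.matcher(text).replaceAll(" ");
    }

    public static void main(String[] args) {
        String input = "Hello\t\tWorld!\nNew  line";
        // Strings are immutable, so the result has to be used or assigned.
        System.out.println(condense(input));
    }
}
```

Precompiling the Pattern is optional; calling text.replaceAll("[\\t ]+", " ") directly should give the same result for a one-off replacement.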
CodePudding user response:
Whitespace is any character or series of characters that represent horizontal or vertical space in typography.
Issue
In your example, \s matched and replaced all of the following:
- regular space like " "
- tab like \t (horizontal)
- carriage-return like \r (vertical)
- new-line or line-feed like \n (vertical)
See this substitution demo for Java's regex-flavor.
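To check this locally, here is a small sketch (WhitespaceDemo, isRegexWhitespace and collapseAll are hypothetical names for this illustration) that tests each of the characters listed above against \s and shows the line break being swallowed:

```java
final class WhitespaceDemo {
    // True if the single character is matched by Java's \s class.
    static boolean isRegexWhitespace(char c) {
        return String.valueOf(c).matches("\\s");
    }

    // Collapses every run of \s characters, line breaks included, into one space.
    static String collapseAll(String text) {
        return text.replaceAll("\\s+", " ");
    }

    public static void main(String[] args) {
        char[] candidates = {' ', '\t', '\r', '\n'};
        for (char c : candidates) {
            System.out.println((int) c + " matches \\s: " + isRegexWhitespace(c));
        }
        // The newline is part of a \s run, so both lines end up on one line.
        System.out.println(collapseAll("Hello\tWorld!\nNew line"));
    }
}
```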
Alternative Solutions
In Java you could easily condense this horizontal whitespace with:
(1) Split by lines and clean each line separately
See the demo on IdeOne:
// requires: import java.util.ArrayList; import java.util.List;
String multiLineText = "\tHello World!" + "\n"
        + "New line";
String lineSeparatorRegex = "\r?\n"; // matches "\n" (Linux/Mac) as well as "\r\n" (Windows)
List<String> condensedLines = new ArrayList<>();
// split(System.lineSeparator()) also works, but only if the text uses the platform's separator
String[] lines = multiLineText.split(lineSeparatorRegex);
for (String line : lines) {
    condensedLines.add(line.replaceAll("\\s+", " ")); // condense each whitespace run to one space
}
String condensedPerLine = String.join(System.lineSeparator(), condensedLines);
Note: System.getProperty("line.separator") is the older way; System.lineSeparator() was introduced in Java 1.7.
(2) Simple multi-line capable regex
as answered by Niko:
// replace each run of tabs and/or spaces with a single space
String condensedPerLine = multiLineText.replaceAll("[\t ]+", " ");
See on Regex101: demo preserving lines.
(3) Use Apache StringUtils with streaming:
The StringUtils class from Apache Commons Lang is well suited to null-safe String handling; for this case, use normalizeSpace(s).
Note also this hint in its JavaDocs:
Java's regexp pattern \s defines whitespace as [ \t\n\x0B\f\r]
// requires commons-lang3, plus: import java.util.Arrays; import java.util.stream.Collectors;
// clean all superfluous whitespace from each line (normalizeSpace also trims both ends)
String condensedPerLine = Arrays.stream(multiLineText.split(System.lineSeparator()))
        .map(s -> StringUtils.normalizeSpace(s))
        .collect(Collectors.joining(System.lineSeparator()));
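If adding Apache Commons Lang is not an option, the same per-line idea can be approximated with the JDK alone. The sketch below is an assumption-laden stand-in, not the library's implementation: PerLineNormalizer and normalizeLines are hypothetical names, lines are split on "\r?\n", each whitespace run is collapsed with \s+, and String.trim() removes leading and trailing whitespace. The result is joined with "\n" rather than the platform separator to keep the output predictable.

```java
import java.util.Arrays;
import java.util.stream.Collectors;

final class PerLineNormalizer {
    // Splits on LF or CRLF, condenses whitespace inside each line,
    // trims both ends, and joins the lines back together with "\n".
    static String normalizeLines(String text) {
        return Arrays.stream(text.split("\\r?\\n"))
                .map(line -> line.replaceAll("\\s+", " ").trim())
                .collect(Collectors.joining("\n"));
    }

    public static void main(String[] args) {
        String multiLineText = "\tHello   World!\r\nNew\t line ";
        System.out.println(normalizeLines(multiLineText));
    }
}
```

Unlike the plain "[\t ]+" replacement, this variant also drops the leading tab and any trailing spaces on each line, which may or may not be what you want.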