I am writing a small CLI app in Rust. It processes user input, and if a special escape character is found, it needs to be interpreted.
For example, if the user inputs the string 'New\nLine', it should be interpreted as:
New
Line
This is my code so far:
fn main() {
    let text: Vec<String> = vec![String::from(r"New\n"),
                                 String::from(r"Line")];
    slash_parser(text);
}
fn slash_parser(text: Vec<String>) {
    let mut text = text.join(" ");
    println!("{}", text); // ---> New\n Line
    if text.contains("\\") {
        text = str::replace(&text, r"\n", r"\\n");
        println!("{}", text); // ---> New\\n Line
        println!("New\nLine"); // ---> New
                               // ---> Line
    }
}
I thought adding an extra \ to the string would make it be interpreted as a new line, but I was clearly mistaken.
For some reason, if a string is passed as an argument, it's interpreted as a string without special characters.
But if the string is printed as a string literal, then the \n symbol is interpreted as a new line.
What am I misunderstanding here, and how can I make the \n in a string be interpreted as a new line?
CodePudding user response:
I think you misunderstand how escaping in string literals works.
The quick answer is:
- A normal string literal such as "New\n" already has the newline character in it. To get the two characters that a user would type, write "New\\n" or r"New\n".
- To replace '\' and 'n' with the newline character, do: text = str::replace(&text, "\\n", "\n");
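Applied to the function from the question, the replacement could look roughly like the sketch below. It assumes the input strings contain the two characters \ and n (as raw strings do), and it returns the result instead of printing it inside the function, which is a change from the original signature made only so the result can be inspected.

```rust
// Sketch of the question's slash_parser, adapted to return the result.
fn slash_parser(text: Vec<String>) -> String {
    // Joining with a space mirrors the original code.
    let text = text.join(" ");
    // r"\n" is the two characters backslash and 'n';
    // "\n" is the single newline character.
    text.replace(r"\n", "\n")
}

fn main() {
    let text = vec![String::from(r"New\n"), String::from(r"Line")];
    println!("{}", slash_parser(text));
}
```

Because the parts are joined with a space, the second line of the output starts with a space.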
Background: String literals and escaping
The string literal "New\nLine" does not contain the characters '\' and 'n'. \n is a single character; the \ here is a so-called escape character that allows for the creation of special characters, like the newline character \n:
let s = "New\nLine";
println!("{:?}", s.chars().collect::<Vec<_>>());
println!("{}", s);
['N', 'e', 'w', '\n', 'L', 'i', 'n', 'e']
New
Line
But if the user enters it, it most definitely is the two characters \ and n. To create a string literal that actually contains those two characters, you have to escape the \, as otherwise \n would be interpreted as a single character:
let s = "New\\nLine";
println!("{:?}", s.chars().collect::<Vec<_>>());
println!("{}", s);
['N', 'e', 'w', '\\', 'n', 'L', 'i', 'n', 'e']
New\nLine
Note that the \ character is shown here as \\. In representations that allow escaping, \\ is used for representing an actual \ character.
Alternatively, you could create a raw string literal which does not perform escaping; meaning, what you see is actually what you get:
let s = r#"New\nLine"#;
println!("{:?}", s.chars().collect::<Vec<_>>());
println!("{}", s);
['N', 'e', 'w', '\\', 'n', 'L', 'i', 'n', 'e']
New\nLine
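The difference between the three literals can also be checked directly, without printing. The following is a small sketch; contains_real_newline is a hypothetical helper introduced only for this illustration.

```rust
// Hypothetical helper: true if the string holds an actual newline character.
fn contains_real_newline(s: &str) -> bool {
    s.contains('\n')
}

fn main() {
    let escaped = "New\nLine";  // '\n' is one character
    let literal = "New\\nLine"; // '\\' and 'n' are two characters
    let raw = r"New\nLine";     // raw string: no escaping is performed

    // The escaped literal is one character shorter than the other two.
    assert_eq!(escaped.chars().count(), 8);
    assert_eq!(literal.chars().count(), 9);
    assert_eq!(raw.chars().count(), 9);

    // "New\\nLine" and r"New\nLine" are the same string.
    assert_eq!(literal, raw);

    // Only the escaped literal contains a real newline.
    assert!(contains_real_newline(escaped));
    assert!(!contains_real_newline(raw));
}
```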
The actual question: interpreting user strings
So if we get a user string, it most certainly does not contain the \n character, but instead the two characters \ and n. So let's use the raw string r#"New\nLine"# to represent the user input.
Now we can simply replace the '\' and 'n' characters with the special character '\n':
fn main() {
    let s = r#"New\nLine"#;
    let s_unescaped = s.replace("\\n", "\n");
    println!("User input:\n{}\n", s);
    println!("Unescaped:\n{}", s_unescaped);
}
User input:
New\nLine
Unescaped:
New
Line
With .replace("\\n", "\n"), we replace the string "\\n", which is the characters \ and n, with "\n", which is the newline character.
Of course, instead of "\\n" we could write the raw string r"\n":
let s_unescaped = s.replace(r"\n", "\n");
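A single replace handles \n, but it cannot tell an escaped backslash apart from the start of an escape: with input like C:\\new, the \n inside it would also be replaced. If more than one escape sequence should be supported, one option is to walk over the characters instead. The function below is only a sketch under that assumption; unescape is a hypothetical helper, and the set of supported escapes (\n, \t, \\) is an arbitrary choice for illustration.

```rust
/// Hypothetical helper: turns the sequences `\n`, `\t` and `\\` typed by a
/// user into the characters they stand for. Anything else is kept as typed.
fn unescape(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    let mut chars = input.chars();
    while let Some(c) = chars.next() {
        if c != '\\' {
            out.push(c);
            continue;
        }
        // We saw a backslash; the next character decides what it means.
        match chars.next() {
            Some('n') => out.push('\n'),
            Some('t') => out.push('\t'),
            Some('\\') => out.push('\\'),
            // Unknown escape: keep both characters unchanged.
            Some(other) => {
                out.push('\\');
                out.push(other);
            }
            // Backslash at the very end of the input: keep it.
            None => out.push('\\'),
        }
    }
    out
}

fn main() {
    println!("{}", unescape(r"New\nLine"));
    // The escaped backslash is consumed first, so "new" stays intact.
    println!("{}", unescape(r"C:\\new"));
}
```

Leaving unknown escapes untouched is a design choice; a stricter tool could return an error for them instead.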
Additional remarks
A lot of confusion also arises from the fact that Display ({}) and Debug ({:?}) print strings differently. While {} prints the string in its unescaped form (meaning newlines actually get printed as line breaks), {:?} escapes the string before printing it:
fn main() {
    let s = "New\nLine";
    println!("{}", s);
    println!("{:?}", s);
    println!("---");
    let s = "New\\nLine";
    println!("{}", s);
    println!("{:?}", s);
}
New
Line
"New\nLine"
---
New\nLine
"New\\nLine"