Interpreting escape characters in a string read from user input

Time:06-12

I am writing a small CLI app in Rust. It processes user input, and if a special escape character is found, it needs to be interpreted.

For example, if the user inputs the string 'New\nLine', it should be interpreted as:
New
Line

This is my code so far:

fn main() {
    let text: Vec<String> = vec![String::from(r"New\n"), 
                                 String::from(r"Line")];
    slash_parser(text);
}


fn slash_parser(text: Vec<String>) {
    let mut text = text.join(" ");
    println!("{}", text); // ---> New\n Line
    if text.contains("\\") {
        text = str::replace(&text, r"\n", r"\\n");
        println!("{}", text); // ---> New\\n Line
        println!("New\nLine") // ---> New
                              // ---> Line
    }
}

I thought adding an extra \ to the string would make it be interpreted as a newline, but I was clearly mistaken.

For some reason, if a string is passed as an argument, it is treated as a string without special characters.
But if the string is printed as a string literal, then \n is interpreted as a newline.

What am I misunderstanding here, and how can I make the \n in a string be interpreted as a newline?

CodePudding user response:

I think you misunderstand how escaping in string literals works.

The quick answer is:

  • text does not contain a newline character. The raw string r"New\n" (equivalent to "New\\n") holds the two characters '\' and 'n', and replacing r"\n" with r"\\n" only adds a second backslash.
  • to replace '\' and 'n' with the newline character, do: text = str::replace(&text, "\\n", "\n");.

Background: String literals and escaping

The string literal "New\nLine" does not contain the characters '\' and 'n'. Inside a literal, \n is a single character: the \ is a so-called escape character that introduces special characters, such as the newline character \n:

let s = "New\nLine";
println!("{:?}", s.chars().collect::<Vec<_>>());
println!("{}", s);

Output:
['N', 'e', 'w', '\n', 'L', 'i', 'n', 'e']
New
Line

But if the user enters it, it most definitely is the two characters \ and n. To create a string literal that actually contains those two characters, you have to escape the \ as otherwise \n would be interpreted as a single character:

let s = "New\\nLine";
println!("{:?}", s.chars().collect::<Vec<_>>());
println!("{}", s);

Output:
['N', 'e', 'w', '\\', 'n', 'L', 'i', 'n', 'e']
New\nLine

Note that the \ character is here shown as \\. In representations that allow escaping, \\ is used for representing an actual \ character.

Alternatively, you could create a raw string literal which does not perform escaping; meaning, what you see is actually what you get:

let s = r#"New\nLine"#;
println!("{:?}", s.chars().collect::<Vec<_>>());
println!("{}", s);

Output:
['N', 'e', 'w', '\\', 'n', 'L', 'i', 'n', 'e']
New\nLine
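
As a quick sanity check of the spellings above, the following sketch compares them directly. The variable names are only illustrative; the point is that the escaped literal and both raw literals produce the same nine characters, while "New\nLine" produces eight.

```rust
fn main() {
    // One special character: an actual newline.
    let newline = "New\nLine";

    // Backslash followed by 'n', written three different ways.
    let escaped = "New\\nLine";
    let raw = r"New\nLine";
    let raw_hash = r#"New\nLine"#;

    // All three spellings yield the same string.
    assert_eq!(escaped, raw);
    assert_eq!(raw, raw_hash);

    // The newline version is a different, shorter string.
    assert_ne!(newline, escaped);
    assert_eq!(newline.chars().count(), 8);
    assert_eq!(escaped.chars().count(), 9);

    println!("all checks passed");
}
```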

The actual question: interpreting user strings

So if we get a user string, it almost certainly does not contain the newline character, but instead the two characters \ and n. So let's use the raw string r#"New\nLine"# to represent the user input.

Now we can simply replace the '\' and 'n' characters with the special character '\n':

fn main() {
    let s = r#"New\nLine"#;

    let s_unescaped = s.replace("\\n", "\n");

    println!("User input:\n{}\n", s);
    println!("Unescaped:\n{}", s_unescaped);
}

Output:
User input:
New\nLine

Unescaped:
New
Line

With .replace("\\n", "\n"), we replace the string "\\n", which is the characters \ and n, with "\n", which is the newline character.

Of course instead of "\\n" we could write the raw string r"\n":

let s_unescaped = s.replace(r"\n", "\n");
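
If more than one escape sequence should be interpreted, chaining several replace calls can misread input such as \\n (an escaped backslash followed by n). One possible alternative, shown here only as a minimal sketch, is to scan the characters once. The helper name unescape and the set of supported escapes (\n, \t, \r, \\) are assumptions made for illustration, not something taken from the question.

```rust
/// Hypothetical helper: interprets a few common backslash escapes in `input`.
/// Unknown escapes and a trailing lone backslash are kept verbatim.
fn unescape(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    let mut chars = input.chars();

    while let Some(c) = chars.next() {
        if c != '\\' {
            out.push(c);
            continue;
        }
        // We saw a backslash: look at the character that follows it.
        match chars.next() {
            Some('n') => out.push('\n'),
            Some('t') => out.push('\t'),
            Some('r') => out.push('\r'),
            Some('\\') => out.push('\\'),
            Some(other) => {
                // Not an escape this sketch knows: keep both characters.
                out.push('\\');
                out.push(other);
            }
            // Backslash at the very end of the input.
            None => out.push('\\'),
        }
    }
    out
}

fn main() {
    // Arguments after the program name, joined with spaces as in the question.
    let args: Vec<String> = std::env::args().skip(1).collect();
    let text = if args.is_empty() {
        String::from(r"New\nLine") // fallback sample input
    } else {
        args.join(" ")
    };
    println!("{}", unescape(&text));
}
```

With this approach, the input r"a\\nb" becomes a, a literal backslash, n, b, whereas .replace("\\n", "\n") alone would turn the second backslash and the n into a newline.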

Additional remarks

A lot of confusion also arises from the fact that Display ({}) and Debug ({:?}) print strings differently. While {} prints the string as is (meaning newlines actually get printed as line breaks), {:?} escapes the string and wraps it in quotes before printing it:

fn main() {
    let s = "New\nLine";
    println!("{}", s);
    println!("{:?}", s);

    println!("---");

    let s = "New\\nLine";
    println!("{}", s);
    println!("{:?}", s);
}

Output:
New
Line
"New\nLine"
---
New\nLine
"New\\nLine"
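
The Debug behaviour can also be reproduced explicitly, which may help when checking what a string really contains. The sketch below uses the standard library method str::escape_debug; for this particular input it produces the same escaping as {:?}, just without the surrounding quotes.

```rust
fn main() {
    let s = "New\nLine";

    // Display leaves the string untouched: it still contains a real newline.
    assert_eq!(format!("{}", s), "New\nLine");

    // Debug escapes the newline and adds surrounding quotes.
    assert_eq!(format!("{:?}", s), "\"New\\nLine\"");

    // escape_debug yields the escaped text without the quotes,
    // i.e. the two characters backslash and 'n' instead of a newline.
    let escaped = s.escape_debug().to_string();
    assert_eq!(escaped, r"New\nLine");

    println!("{}", escaped);
}
```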