Rust compiles method chain only when split to multiple statements

Time:12-27

I was parsing some string input from a file when I came across this error. Normally it should not make a difference whether you chain a series of methods in a single statement or split them across several. Yet here, the code does not compile when the whole method chain is in a single statement.

I do not get an error when it is split into multiple statements, like so:

let input = std::fs::read_to_string("tst_input.txt").expect("Failed to read input");
let input = input
    .lines()
    .map(|l| {
        let mut iter = l.split(" | ");
        (
            iter.next()
                .unwrap()
                .split_whitespace()
                .collect::<Vec<&str>>(),
            iter.next()
                .unwrap()
                .split_whitespace()
                .collect::<Vec<&str>>(),
        )
    })
    .collect::<Vec<_>>();

I get a lifetime error when it is in a single statement, like so:

let input = std::fs::read_to_string("tst_input.txt")
    .expect("Failed to read input")
    .lines()
    .map(|l| {
        let mut iter = l.split(" | ");
        (
            iter.next()
                .unwrap()
                .split_whitespace()
                .collect::<Vec<&str>>(),
            iter.next()
                .unwrap()
                .split_whitespace()
                .collect::<Vec<&str>>(),
        )
    })
    .collect::<Vec<_>>();

error[E0716]: temporary value dropped while borrowed
  --> src/main.rs:2:17
   |
2  |       let input = std::fs::read_to_string("tst_input.txt")
   |  _________________^
3  | |         .expect("Failed to read input")
   | |_______________________________________^ creates a temporary which is freed while still in use
...
18 |           .collect::<Vec<_>>();
   |                               - temporary value is freed at the end of this statement
19 |       println!("{:?}", input);
   |                        ----- borrow later used here
   |
   = note: consider using a `let` binding to create a longer lived value

Shouldn't these two cases be effectively identical? Why is the compiler treating them differently? Could this possibly be a compiler bug?

CodePudding user response:

These two cases are not identical: they differ in where the string read from the file is stored, and therefore in how long it lives.

In Rust, variables have semantic meaning: they act as the place where a value is stored and, more importantly, they define when that value is destroyed. Destruction is handled by the Drop trait. By default, a value is dropped when the variable that owns it goes out of scope; this can be overridden with mem::forget and some other functions like Box::into_raw, but these are rather niche cases.
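
To make the difference between a named variable and a temporary visible, here is a minimal sketch. The Noisy type, its len method and the drop_order function are hypothetical names invented for this illustration; the type simply records in a shared log when it is dropped.

```rust
use std::cell::RefCell;

// Hypothetical type that records its own destruction in a shared log.
struct Noisy<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<String>>,
}

impl Noisy<'_> {
    fn len(&self) -> usize {
        self.name.len()
    }
}

impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("drop {}", self.name));
    }
}

fn drop_order() -> Vec<String> {
    let log: RefCell<Vec<String>> = RefCell::new(Vec::new());
    {
        // Bound to a variable: lives until the end of this block.
        let named = Noisy { name: "named", log: &log };

        // Never bound to a variable: a temporary, dropped at the end
        // of this `let` statement.
        let n = Noisy { name: "temporary", log: &log }.len();
        log.borrow_mut().push(format!("statement done, len = {}", n));

        log.borrow_mut()
            .push(format!("block ending, named.len() = {}", named.len()));
    }
    log.into_inner()
}

fn main() {
    for line in drop_order() {
        println!("{}", line);
    }
}
```

The log shows "drop temporary" before "statement done", while "drop named" only appears after the block ends. The one-line method chain in the question hits the same rule: the String returned by expect is a temporary of this kind.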

In the first case, the data being read is stored in the first input variable, of type String. This type wraps Vec<u8>, which implements Drop, so the data is deallocated when that variable goes out of scope. The second input variable is of type Vec<(Vec<&str>, Vec<&str>)>. It contains references, so it borrows from the first input and must live no longer than the source string. Here, this is satisfied, as long as you don't try to return this value up the stack: in that case the source string would be dropped and the references would dangle.
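
One way to keep the borrowed form and still move the parsing into a function is to let the caller own the string. The sketch below assumes a hypothetical helper named parse and uses an in-memory string in place of the file contents; it is an illustration, not the only possible design.

```rust
// Hypothetical helper: the result borrows from `input`, so the caller
// must keep the source string alive for as long as it uses the result.
fn parse(input: &str) -> Vec<(Vec<&str>, Vec<&str>)> {
    input
        .lines()
        .map(|l| {
            let mut iter = l.split(" | ");
            (
                iter.next()
                    .unwrap()
                    .split_whitespace()
                    .collect::<Vec<&str>>(),
                iter.next()
                    .unwrap()
                    .split_whitespace()
                    .collect::<Vec<&str>>(),
            )
        })
        .collect::<Vec<_>>()
}

fn main() {
    // In-memory stand-in for the contents of the input file.
    let text = String::from("a b | c d\ne f | g");
    // `parsed` borrows from `text`, which lives until the end of main.
    let parsed = parse(&text);
    println!("{:?}", parsed);
}
```

A function that reads the file itself and returns this borrowed type would not compile, because the String would be dropped when the function returns.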

In the one-line version, however, the string is not stored anywhere. It is a temporary, which is destroyed at the end of the statement, so you are not allowed to keep any references to it past that point. You can, however, make an owned version of the split data by inserting an extra mapping operation:

let _: Vec<(Vec<String>, Vec<String>)> = std::fs::read_to_string("tst_input.txt")
    .expect("Failed to read input")
    // This iterator borrows from the temporary...
    .lines()
    .map(|l| {
        // ...this iterator reborrows that borrow...
        let mut iter = l.split(" | ");
        (
            iter.next()
                .unwrap()
                .split_whitespace()
                // ...and this operation clones the source data,
                // so they are copied to the new owned location,
                // and not referenced anymore, so can be freely dropped
                .map(str::to_owned)
                .collect::<Vec<_>>(),
            iter.next()
                .unwrap()
                .split_whitespace()
                .map(str::to_owned)
                .collect::<Vec<_>>(),
        )
    })
    .collect::<Vec<_>>();

CodePudding user response:

A minimal recreation of the issue might help:

let split_value = String::from("example") // <- temporary String, never bound to a variable
    .split("x");
// the String has no owner, so it is dropped at the end of the statement above
println!("{:?}", split_value); // error

A reference must not outlive the value it refers to. Because the string is not stored anywhere, and therefore has no owner, it is dropped at the end of the statement.

And because split returns data that references that string, the result cannot be used once the string is gone.

By storing the string in a variable first, it gets an owner and lives past the end of the statement.

let str_result = String::from("example"); // str_result owns the string value
let split_value = str_result.split("x");
println!("{:?}", split_value);

split_value can be printed because str_result lives until the end of the enclosing scope, so references into it remain valid until then.
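
A single-statement chain is still fine when the final result does not borrow from the temporary. The sketch below uses a hypothetical function named split_owned to show two such cases: counting the pieces, and copying them into owned Strings.

```rust
fn split_owned() -> (usize, Vec<String>) {
    // The temporary String lives until the end of each `let` statement,
    // which is long enough because neither result borrows from it.
    let count = String::from("example").split("x").count();
    let parts: Vec<String> = String::from("example")
        .split("x")
        // Copy each piece into its own String before the temporary is dropped.
        .map(str::to_owned)
        .collect();
    (count, parts)
}

fn main() {
    let (count, parts) = split_owned();
    println!("{} {:?}", count, parts);
}
```

Both statements compile because a usize and a Vec<String> hold no references into the temporary string.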
