Rust regexes live long enough for match but not find

Time:10-17

I'm trying to understand why is_match behaves differently from find, going by the regex documentation.

I have the following for match:

use regex::Regex;
{
let meow = String::from("This is a long string that I am testing regexes on in rust.");
let re = Regex::new("I").unwrap();

let x = re.is_match(&meow);
dbg!(x)
}

And get:

[src/lib.rs:142] x = true

Great, now let's identify the location of the match:

{
let meow = String::from("This is a long string that I am testing regexes on in rust.");
let re = Regex::new("I").unwrap();

let x = re.find(&meow).unwrap();
dbg!(x)
}

And I get:

let x = re.find(&meow).unwrap();
                ^^^^^ borrowed value does not live long enough
}
^ `meow` dropped here while still borrowed
`meow` does not live long enough

I think I'm following the documentation. Why does the string meow live long enough for a match but not long enough for find?

Answer:

Writing an expression without a ; at the end of a { } scope makes that expression the value of the scope, so it is effectively returned out of it. For example:

fn main() {
    let x = {
        let y = 10;
        y + 1
    };

    dbg!(x);
}
[src/main.rs:7] x = 11

Here, because we don't write a ; after the y + 1, it gets returned from the inner scope and written to x.

If you write a ; after it, you will get something different:

fn main() {
    let x = {
        let y = 10;
        y + 1;
    };

    dbg!(x);
}
[src/main.rs:7] x = ()

Here you can see that the ; now prevents the value from being returned. Because no value gets returned from the inner scope, the scope evaluates to the unit type (), which gets stored in x.


The same happens in your code:

use regex::Regex;

fn main() {
    let z = {
        let meow = String::from("This is a long string that I am testing regexes on in rust.");
        let re = Regex::new("I").unwrap();

        let x = re.is_match(&meow);
        dbg!(x)
    };

    dbg!(z);
}
[src/main.rs:9] x = true
[src/main.rs:12] z = true

Because you don't write a ; after the dbg!() call, its return value gets returned from the inner scope. The dbg!() macro prints its argument and then returns it, so the return value of the inner scope is x. And because x is just a bool, which borrows nothing, it gets returned without a problem.

Now let's look at your second example:

use regex::Regex;

fn main() {
    let z = {
        let meow = String::from("This is a long string that I am testing regexes on in rust.");
        let re = Regex::new("I").unwrap();

        let x = re.find(&meow).unwrap();
        dbg!(x)
    };

    dbg!(z);
}
error[E0597]: `meow` does not live long enough
  --> src/main.rs:8:25
   |
4  |     let z = {
   |         - borrow later stored here
...
8  |         let x = re.find(&meow).unwrap();
   |                         ^^^^^ borrowed value does not live long enough
9  |         dbg!(x)
10 |     };
   |     - `meow` dropped here while still borrowed

And now it should be more obvious what's happening: it's basically the same as the previous example, except that the returned x is now a type that internally borrows meow. Because meow gets dropped at the end of the scope, x cannot be returned, as it would outlive meow.

The reason x borrows from meow is that a regex Match doesn't copy the text it matched; it only stores a reference to the string that was searched.
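
To make the borrow visible, here is a minimal sketch that uses only the standard library. Span and find_span are hypothetical stand-ins for regex's Match and Regex::find, not the crate's actual implementation; the point is only that the lifetime parameter ties the result to the string that was searched.

```rust
// Hypothetical stand-in for regex::Match: it stores a reference to the
// searched string plus two offsets, instead of copying the matched text.
#[derive(Debug)]
struct Span<'h> {
    haystack: &'h str,
    start: usize,
    end: usize,
}

impl<'h> Span<'h> {
    // The returned slice borrows from the haystack, not from the Span itself.
    fn as_str(&self) -> &'h str {
        &self.haystack[self.start..self.end]
    }
}

// Hypothetical helper: locates the first occurrence of `needle` with
// std's str::find. The 'h lifetime links the result to `haystack`.
fn find_span<'h>(haystack: &'h str, needle: &str) -> Option<Span<'h>> {
    let start = haystack.find(needle)?;
    Some(Span { haystack, start, end: start + needle.len() })
}

fn main() {
    let meow = String::from("This is a long string that I am testing regexes on in rust.");
    // `span` can only be used while `meow` is alive, because it points into it.
    let span = find_span(&meow, "I").unwrap();
    println!("{}..{} = {:?}", span.start, span.end, span.as_str());
}
```

Trying to return span out of a scope in which meow was declared produces the same E0597 error as above.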

So if you add a ;, you prevent the value from being returned from the scope, changing the scope return value to ():

use regex::Regex;

fn main() {
    let z = {
        let meow = String::from("This is a long string that I am testing regexes on in rust.");
        let re = Regex::new("I").unwrap();

        let x = re.find(&meow).unwrap();
        dbg!(x);
    };

    dbg!(z);
}
[src/main.rs:9] x = Match {
    text: "This is a long string that I am testing regexes on in rust.",
    start: 27,
    end: 28,
}
[src/main.rs:12] z = ()
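
If you do need the match outside the scope, one option is to copy what you need into owned values before the scope ends (another is to declare meow outside the scope). Below is a minimal sketch of the first option using only the standard library: str::find stands in for Regex::find so the example has no dependencies, and owned_match is a hypothetical helper. With the regex crate, the analogous calls on the Match would be start(), end() and as_str().to_owned().

```rust
// Hypothetical helper: returns (start, end, matched text) as owned data,
// so nothing in the result borrows from the local String.
fn owned_match(needle: &str) -> Option<(usize, usize, String)> {
    let z = {
        let meow = String::from("This is a long string that I am testing regexes on in rust.");
        let start = meow.find(needle)?;
        let end = start + needle.len();
        // `text` borrows from `meow` and is only valid inside this scope.
        let text: &str = &meow[start..end];
        // usize is Copy and to_owned() allocates a new String, so the
        // tuple holds no reference to `meow`.
        (start, end, text.to_owned())
    }; // `meow` is dropped here; `z` is still valid.
    Some(z)
}

fn main() {
    let z = owned_match("I");
    dbg!(z);
}
```

The cost is one extra allocation for the copied text; if only the offsets are needed, returning (start, end) alone avoids it.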