Why do &str arrays in Rust passed as parameters have different lifetimes?

I am learning Rust and was testing array copying through a function. I am sure there are built-in Rust functions to copy or clone arrays, but I thought writing my own implementation would help me understand how references are passed through functions.

fn copy_str_arr_original (a1: [&str; 60], a2: &mut [&str; 60]) {
    // copy 1 into 2
    for i in 0..60 {
        a2[i] = a1[i];
    } // change is reflected in a2 as it is passed as &mut
}

However, this threw the error "these two types are declared with different lifetimes..." for the &str types themselves. After some further studying, I tried declaring my own lifetime and assigning it to both, and that fixed it!

fn copy_str_arr_fix<'a> (a1: [&'a str; 60], a2: &mut [&'a str; 60]) {
    // copy 1 into 2
    for i in 0..60 {
        a2[i] = a1[i];
    } // change is reflected in a2 as it is passed as &mut
}

Why is this the case, though? Why does the type of values within the array need to have a lifetime assigned instead of the parameters themselves? In other words, why does this not work at all?

fn copy_str_arr_bad<'a> (a1: &'a [&str; 60], a2: &'a mut [&str; 60]) {
    // does not work...           ^-----------------------^-------- different lifetimes
    for i in 0..60 {
        a2[i] = a1[i]; 
    } 
}

I am still struggling to get the hang of how lifetimes work in the context of more complex objects such as arrays and structs, so any explanation would be greatly appreciated!

CodePudding user response:

The error message is a bit confusing because it refers to lifetimes generated by the rules of lifetime elision. In your case, lifetime elision means that:

fn copy_str_arr_original(a1: [&str; 60], a2: &mut [&str; 60])

is syntactic sugar for:

fn copy_str_arr_original<'a1, 'a2_mut, 'a2>(a1: [&'a1 str; 60], a2: &'a2_mut mut [&'a2 str; 60])

In other words, we have three completely unrelated lifetimes. "Unrelated" means that the caller gets to choose how long the objects they're associated with live. For example, the strings in a2 might be static and live until the end of the program, while the strings in a1 might get dropped immediately after copy_str_arr_original() returns. Or the other way around. If that amount of freedom seems like it could cause problems, you're on the right track because the borrow checker agrees with you.

Note that, somewhat counter-intuitively, the length of the 'a2_mut lifetime is completely irrelevant: it can be as long or as short as the caller likes. Our function has received the reference and can therefore use it for the duration of the call. The 'a2_mut lifetime tells us how long the borrow lasts outside the scope of the function, and we just don't care about that.
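As a rough illustration of why the outer borrow does not matter, here is a minimal sketch. The constant N (shortened from 60) and the helper demo are assumptions made for the example. The signature mirrors the single-lifetime fix from the question, with the lifetime of the &mut borrow left anonymous.

```rust
// Assumption: N is shortened from 60 to keep the example small.
const N: usize = 3;

// Same shape as the fixed function in the question: the lifetime of the
// `&mut` borrow itself is left anonymous.
fn copy_str_arr<'a>(a1: [&'a str; N], a2: &mut [&'a str; N]) {
    for i in 0..N {
        a2[i] = a1[i];
    }
}

// Hypothetical helper: borrows `dst` mutably twice, each time only for
// the duration of one call.
fn demo() -> [&'static str; N] {
    let mut dst = [""; N];
    copy_str_arr(["x", "y", "z"], &mut dst); // first short `&mut` borrow
    copy_str_arr(["p", "q", "r"], &mut dst); // second, independent borrow
    dst // `dst` is freely usable again once the borrows have ended
}

fn main() {
    println!("{:?}", demo());
}
```

Each &mut dst borrow ends when the call returns, while the copied string references stay valid for as long as their own lifetime allows.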

'a1 and 'a2 are another matter. Since we're copying references from a1 to a2, we are effectively casting the references inside a1 (of type &'a1 str) to the type of references stored in a2 (which is &'a2 str):

a2[i] = a1[i];  // implicitly casts &'a1 str to &'a2 str

For that to be valid, &'a1 str must be a subtype of &'a2 str. While Rust doesn't have classes and subclassing in the C++ sense, it does have subtypes where lifetimes are concerned. In that sense, A is a subtype of B if values of A are guaranteed to live at least as long as values of B. In other words, 'a1 must live at least as long as 'a2, which is expressed as 'a1: 'a2. So this compiles:

fn copy_str_arr<'a1: 'a2, 'a2, 'a2_mut>(a1: [&'a1 str; 60], a2: &'a2_mut mut [&'a2 str; 60]) {
    for i in 0..60 {
        a2[i] = a1[i];
    }
}

Another way for the cast to succeed is to just require the lifetimes to be the same, which you effectively did in your code. (You also omitted the 'a2_mut lifetime, which the compiler correctly interpreted as a request for an unrelated anonymous lifetime.)
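To see the 'a1: 'a2 bound from the caller's side, here is a small sketch. N (shortened from 60), the helper demo, and the sample strings are assumptions for the example. The source holds 'static string literals, while the destination is limited by a local String.

```rust
// Assumption: N is shortened from 60 to keep the example small.
const N: usize = 3;

fn copy_str_arr<'a1: 'a2, 'a2>(a1: [&'a1 str; N], a2: &mut [&'a2 str; N]) {
    for i in 0..N {
        a2[i] = a1[i];
    }
}

// Hypothetical helper returning the total length of the copied strings.
fn demo() -> usize {
    let local = String::from("local");
    // The references in `dst` may not outlive `local`, so 'a2 is short.
    let mut dst: [&str; N] = [local.as_str(); N];
    // String literals are `&'static str`, so 'a1 is 'static here.
    let src: [&'static str; N] = ["a", "b", "c"];
    // 'static outlives every other lifetime, so the bound 'a1: 'a2 holds.
    copy_str_arr(src, &mut dst);
    dst.iter().map(|s| s.len()).sum()
}

fn main() {
    println!("{}", demo());
}
```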

CodePudding user response:

Let's assume that you can define copy_str_arr with two different, unrelated lifetimes, like this:

fn copy_str_arr<'a, 'b>(a1: [&'a str; 60], a2: &mut [&'b str; 60]) {
    // ...
}

Then consider this example:

let mut outer: [&str; 60] = [""; 60];

{
    let temp_string = String::from("temporary string");
    
    let inner: [&str; 60] = [&temp_string; 60];

    // this compiles because our bad `copy_str_arr` function allows
    // `inner` and `outer` to have unrelated lifetimes
    copy_str_arr(inner, &mut outer);

}   // <-- `temp_string` destroyed here

// now `outer` contains references to `temp_string` here, which is invalid
// because it has already been destroyed!

println!("{:?}", outer); // undefined behavior! may print garbage, crash your
                         // program, make your computer catch fire or anything else

As you can see, if a1 and a2 are allowed to have completely unrelated lifetimes, then we can end up in a situation where one of the arrays holds references to invalid data, which is very bad.

However, the lifetimes do not have to be the same. You can instead require that the lifetime you are copying from outlives the lifetime you are copying to (thus ensuring that you're not illegally extending the lifetime of a reference):

fn copy_str_arr<'a, 'b>(a1: &[&'a str; 60], a2: &mut [&'b str; 60])
where
    'a: 'b, // 'a (source) outlives 'b (destination)
{
    for i in 0..60 {
        a2[i] = a1[i];
    }
}
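As a usage sketch for the bounded version, the following shows a call the compiler should accept, because the source string is declared before the destination array and is still alive whenever the destination is read. N (shortened from 60) and the helper demo are assumptions for the example.

```rust
// Assumption: N is shortened from 60 to keep the example small.
const N: usize = 2;

fn copy_str_arr<'a, 'b>(a1: &[&'a str; N], a2: &mut [&'b str; N])
where
    'a: 'b, // 'a (source) outlives 'b (destination)
{
    for i in 0..N {
        a2[i] = a1[i];
    }
}

// Hypothetical helper: the source string is declared first, so it is
// still alive every time `dst` is read.
fn demo() -> String {
    let long_lived = String::from("long");
    let src: [&str; N] = [long_lived.as_str(); N];
    let mut dst: [&str; N] = [""; N];
    copy_str_arr(&src, &mut dst);
    // The opposite arrangement (filling `dst` from a string owned by an
    // inner block and reading `dst` after that block ends) is the case
    // the bound is meant to rule out.
    dst.join(",")
}

fn main() {
    println!("{}", demo());
}
```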

CodePudding user response:

The simple answer is that the compiler is not very smart.

The fact that you do not have to specify a bunch of lifetimes every time you define a function which handles references is only because the compiler takes a few educated guesses if it can. So it is a little bit smart, but not very.

Say you are writing a function that takes a reference to a struct and returns a reference to a field in that struct:

struct Book {
  pages: u16,
  title: String,
}

fn borrow_title(book: &Book) -> &str {
  &book.title
}

Nine times out of ten the returned reference does indeed borrow from the argument you passed. But sometimes it does not:

fn borrow_title(book: &Book) -> &'static str {
  if book.pages > 10 {
    "Too long..."
  } else {
    "Not long enough"
  }
}

As you can see, you would need to specify that the returned &str has a different lifetime (in this case the special 'static).
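To make the difference concrete, here is a sketch with the elided lifetime written out. The name verdict, the helper demo, and the sample book are assumptions for the example. The first function is what the compiler's guess for the elided signature amounts to, and the second returns a 'static reference that can outlive the book.

```rust
struct Book {
    pages: u16,
    title: String,
}

// What `fn borrow_title(book: &Book) -> &str` means once the elided
// lifetime is written out: the result borrows from `book`.
fn borrow_title<'a>(book: &'a Book) -> &'a str {
    &book.title
}

// Hypothetical name for the second variant: the result is 'static and
// does not borrow from `book` at all.
fn verdict(book: &Book) -> &'static str {
    if book.pages > 10 {
        "Too long..."
    } else {
        "Not long enough"
    }
}

// Hypothetical helper with a made-up sample book.
fn demo() -> (String, &'static str) {
    let title;
    let v;
    {
        let book = Book { pages: 12, title: String::from("Dune") };
        // The borrowed title must be copied out before `book` is dropped.
        title = borrow_title(&book).to_owned();
        // The 'static result may be kept after `book` is gone.
        v = verdict(&book);
    }
    (title, v)
}

fn main() {
    println!("{:?}", demo());
}
```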

So since you say fn copy_str_arr_original (a1: [&str; 60], a2: &mut [&str; 60]), the compiler does not actually reason about your implementation and does not know that the lifetime of references in a1 should be at least as long as the lifetime of any reference in a2.

As for the second part, you need to consider that a reference is just a pointer to some data. That data can contain other references. In this case it is these other references that are important.

You have 2 arrays of string references here. Say you copy the references from the first one to the second one. Whether you pass these arrays into the function by reference or not is not important. What is important is that if whatever owns the strings referenced by the first array were dropped, the strings would be gone too. And if the second array still held any references to them, this would result in unsafe memory handling.

To simplify, let's consider that there is only one string and we are going to borrow values into an array and then copy those borrowed values to another array, drop the first array and then drop the string. What would you expect to happen?

The compiler will throw a fit in order to ensure that no references to the string remain.
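Here is a minimal sketch of that scenario, assuming arrays of length 2 and a hypothetical helper demo. Letting the first array go out of scope is harmless because it only holds borrows. It is dropping the string while the second array is still in use that the compiler should refuse.

```rust
// Hypothetical helper: one string, two arrays of references to it.
fn demo() -> usize {
    let s = String::from("hello");
    let mut second: [&str; 2] = [""; 2];
    {
        // Borrow the string into the first array...
        let first: [&str; 2] = [s.as_str(); 2];
        // ...and copy those borrows into the second array.
        for i in 0..2 {
            second[i] = first[i];
        }
    } // `first` ends here; fine, since it never owned the string

    // drop(s); // rejected if uncommented: `second` still borrows `s`
    //          // and is read on the next line

    second[0].len() + second[1].len()
}

fn main() {
    println!("{}", demo());
}
```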