Home > OS >  Lifetimes of thread::scope()'s variables and spawned threads
Lifetimes of thread::scope()'s variables and spawned threads

Time:01-27

I'm trying to do some parallel processing on a list of values:

fn process_list(list: Vec<f32>) -> Vec<f32> { // Block F
    let chunk_size = 100;
    let output_list = vec![0.0f32;list.len()];

    thread::scope(|s| { // Block S
        (0..list.len()).collect::<Vec<_>>().chunks(chunk_size).for_each(|chunk| { // Block T
            s.spawn(|| {
                chunk.into_iter().for_each(|&idx| {
                    let value = calc_value(list[idx]);
                    unsafe {
                        let out = (output_list.as_ptr() as *mut f32).offset(idx as isize);
                        *out = value;
                    }
                });
            });
        });
    });
    output_list
}

The API says that thread::scope() only returns once each thread spawned by the scope it creates has finished. However, the compiler is telling me that the temporary range object (0..list.len()) is deconstructed while the threads that use it might still be alive.

I'm curious about what's actually happening under the hood. My intuition tells me that each thread spawned and variable created within Block S would both have Block S's lifetime. But clearly the threads have a lifetime longer than Block S.

Why aren't these lifetimes be the same?

Is the best practice here to create a variable in Block F that serves the purpose of the temporary like so:

fn process_list(list: Vec<f32>) -> Vec<f32> { // Block F
    let chunk_size = 100;
    let output_list = vec![0.0f32;list.len()];
    let range = (0..list.len()).collect::<Vec<_>>();

    thread::scope(|s| { // Block S
        range.chunks(chunk_size).for_each(|chunk| { // Block T
            s.spawn(|| {
                chunk.into_iter().for_each(|&idx| {
                    let value = calc_value(list[idx]);
                    unsafe {
                        let out = (output_list.as_ptr() as *mut f32).offset(idx as isize);
                        *out = value;
                    }
                });
            });
        });
    });
    output_list
}

CodePudding user response:

You don't have to ask SO to find out what happens under the hood of std functions, their source code is readily available, but since you asked I'll try to explain a little more in the comments.

pub fn scope<'env, F, T>(f: F) -> T
where
    F: for<'scope> FnOnce(&'scope Scope<'scope, 'env>) -> T,
{
    // `Scope` creation not very relevant to the issue at hand
    let scope = ...;

    // here your 'Block S' gets run and returns.
    let result = catch_unwind(AssertUnwindSafe(|| f(&scope)));
    // the above is just a fancy way of calling `f` while catching any panics.

    // but we're waiting for threads to finish running until here
    while scope.data.num_running_threads.load(Ordering::Acquire) != 0 {
        park();
    }
    // further not so relevant cleanup code
    //...
}

So as you can see your assumption that 'Block S' will stick around as long as any of the threads is wrong.


And yes the solution is to capture the owner of the chunks before you call thread::scope. There is also no reason to dive into unsafe for your example, you can use zip instead:

fn process_list(list: Vec<f32>) -> Vec<f32> { // Block F
    let chunk_size = 100;
    let mut output_list = vec![0.0f32; list.len()];
    let mut zipped = list
        .into_iter()
        .zip(output_list.iter_mut())
        .collect::<Vec<_>>();

    thread::scope(|s| { // Block S
        zipped.chunks_mut(chunk_size).for_each(|chunk| { // Block T
            s.spawn(|| {
                chunk.into_iter().for_each(|(v, out)| {
                    let value = calc_value(*v);
                    **out = value;
                });
            });
        });
    });
    output_list
}
  • Related