Home > Enterprise >  What does it mean to be generic over a closure type in Rust?
What does it mean to be generic over a closure type in Rust?

Time:12-28

What does it mean when a type parameter is bounded by an Fn type?

fn call<F: Fn() -> u64>(f: F) -> u64 {
    f()
}

The Rust Book says that functions with type parameters are monomorphosed: https://doc.rust-lang.org/stable/book/ch10-01-syntax.html?highlight=generic function#performance-of-code-using-generics.

If that's the case, then I'd expect a new version of call to be generated for each new closure type it is instantiated with.

According to the Rust Reference, each closure type is distinct, even if the function body is the same: https://doc.rust-lang.org/reference/types/closure.html.

So when I look at the compiled output for this program, I'd expect the compilation artifact to be larger when I call call with 1799999999999999999 distinct instantiations of F as compared to just four instantiations, but that's not what happens.

(1)

// `call` is defined above

fn make_closure(n: u64) -> impl Fn() -> u64 {
   move || n 
}

fn main() {
    let results = (0..17999999999999999999).map(make_closure).map(call);
    for r in results {
        println!("{}", r)
    }
}

So what's the right mental model for what fn call<F: Fn() -> u64> means? Should I think of the lack of code bloat as merely an optimization?

It gets harder for me to sustain this mental model when the closure is built from user input, though:

(2)

fn main() {
    let mut buf = String::new();
    std::io::stdin().read_line(&mut buf).unwrap();
    let n = buf.trim().parse::<u64>().unwrap();
    println!("{}", call(make_closure(n)));
}

So what's the right way to think about what the signature of call means?

Update

Adding more information so we can reference it from comments:

(3)

The Rust Reference says:

A closure expression produces a closure value with a unique, anonymous type that cannot be written out. A closure type is approximately equivalent to a struct which contains the captured variables.

(4) The first line below is accepted by rustc, the second is not:

    vec![make_closure(1), make_closure(2)]; // OK
    vec![(|| 1), (|| 1)];                   // Error: mismatched types

CodePudding user response:

I think you are mixing concepts here. Indeed, every function in Rust has its own type, and so does every closure, even if they have the same body. So in this example:

fn id0(x: u64) -> u64 { x }
fn id1(x: u64) -> u64 { x }
fn main() {
    let id2 = || 1;
    let id3 = || 1;
}

every id0, id1, id2 and id3 have distinct types, all of them implement the Fn() -> u64 trait. If you write call(id0) or call(id1) you will get monomorphized versions of that function that call id0 or id1 statically and directly, without doing any indirect function pointer call.

And since they have an empty capture set, they can also be coerced into the non-generic function type: fn() -> u64. But if you force that, then you lose that monomorphization: call(id0 as fn() -> u64) and call(id1 as fn() -> u64) both call to the same instantation of the call() function, that calls the inner closure indirectly using the given pointer.

Then, in your example:

fn make_closure(n: u64) -> impl Fn() -> u64 {
   move || n 
}

there is only one closure. You can call this function with any number, and that number is captured by the closure, but that does not alter the type of the returned value. That type is determined at compilation time, always the same.

A different thing would be if you use a constant generic argument:

fn make_closure_gen<const N: u64>() -> impl Fn() -> u64 {
   || N
}

Now for every N you have an instantiation of that generic function, and each one with a different return type. But you cannot create a runtime loop that instantiates 1000000 of these functions, because N must be constant. (You can try a compile-time loop, however).

Anyway, even if you manage to create millions of instantiations of the same function, if the compiler detects that the inner code of several functions is the same it can merge them into a single one, (sometimes call code deduplication), so that the final executable does not blow up. The details vary from version to version of the compiler, and it can even depend on the linker used. It can even apply to non-generic functions!

CodePudding user response:

Just like normal types, closure types can also be inhabited by multiple values. move || n has the same type, regardless of the value of n (the closure type is determined by the body of the closure, not what the captured values are). So, your 17999999999999999999 closures are all of the same type despite having a different captured value of n, thus only needing one instance of call. If you actually have 2 different closures like in the following code:

fn make_closure(n: u64) -> impl Fn() -> u64 {
   move || n 
}

fn make_closure2(n: u64) -> impl Fn() -> u64 {
   move || n   1
}

fn call<F: Fn() -> u64>(f: F) -> u64 {
    f()
}

pub fn main() {
    call(make_closure(5));
    call(make_closure2(5));
}

You can see two different versions of example::call in the decompilation.

  • Related