Lambda captures vs parameters - any performance difference?

auto lambdaFunction1 = [](someClass& obj, int x){/**/};
auto lambdaFunction2 = [&someClassObj, x](){/**/};

Is there any performance difference between a lambda that captures variables and one that takes them as parameters? If I'm in a position where I can use either, should I always prefer one option over the other? Are there any rules about that, or is it just a matter of personal preference?

PS: I'm aware that in some cases I won't have such a choice, for example when using STL algorithms. I'm asking about situations where I can use either.

CodePudding user response：

A rough answer:

Are you converting the lambdas into type-erased objects (like std::function) at some point?

If not, and every lambda is used through a template, most likely it is all inlined and it just doesn't matter. Most STL algorithms that take a callable work like that, and they are very fast. In this case, use whatever is easier to read and understand later.
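To make the template case concrete, here is a minimal sketch. The names `apply_to_each`, `sum_with_capture`, and `sum_with_parameter` are hypothetical and not from the question. Because the callable's type is a template parameter, the compiler sees the lambda body at the call site and can usually inline it, whether the state arrives through a capture or through a parameter.

```cpp
#include <vector>

// Hypothetical helper: the concrete type F is known at compile time,
// so f(v) is a direct call that the optimizer can usually inline.
template <class F>
void apply_to_each(const std::vector<int>& values, F f) {
    for (int v : values) f(v);
}

// State reaches the lambda through captures.
int sum_with_capture(const std::vector<int>& values, int offset) {
    int total = 0;
    apply_to_each(values, [&total, offset](int v) { total += v + offset; });
    return total;
}

// Same computation, with the state passed as explicit parameters.
int sum_with_parameter(const std::vector<int>& values, int offset) {
    auto add = [](int& total, int v, int off) { total += v + off; };
    int total = 0;
    for (int v : values) add(total, v, offset);
    return total;
}
```

Both functions compute the same result. Whether they compile to identical machine code depends on the compiler and optimization level, so treat this as an illustration rather than a guarantee.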

If you do convert it, then a plain function pointer is cheaper than std::function, because there is no dynamically allocated capture state travelling with the function. Note that only a lambda with no captures can convert to a function pointer.
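A small sketch of the two conversions, with hypothetical function names. The first lambda has no captures, so it converts implicitly to a function pointer. The second captures `a`, so it needs a wrapper such as std::function, which may allocate storage for the captured state depending on the implementation.

```cpp
#include <functional>

// A captureless lambda converts implicitly to a plain function pointer.
int call_through_pointer(int a, int b) {
    int (*add)(int, int) = [](int x, int y) { return x + y; };
    return add(a, b);
}

// A capturing lambda carries state, so it cannot become a function pointer.
// std::function type-erases it and may allocate storage for the capture.
int call_through_std_function(int a, int b) {
    std::function<int(int)> add_a = [a](int y) { return a + y; };
    return add_a(b);
}
```

Both return the same sum. The difference lies in how the call is dispatched and where the captured state is stored.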

A more accurate answer: benchmark it!
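As a starting point, a rough timing harness could look like the sketch below. `Counter`, `time_ns`, `time_capture`, and `time_parameter` are hypothetical names, and `Counter` only stands in for the question's `someClass`. This is not a rigorous benchmark: with optimizations enabled the compiler may simplify or remove the loop entirely, so a dedicated benchmarking library is a better choice for real measurements.

```cpp
#include <chrono>
#include <cstdint>

struct Counter { std::int64_t value = 0; };  // hypothetical stand-in for someClass

// Calls f(i) `iterations` times and returns the elapsed time in nanoseconds.
template <class F>
std::int64_t time_ns(F f, int iterations) {
    auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i) f(i);
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count();
}

// Variant 1: the state is captured.
std::int64_t time_capture(Counter& c, int x, int iterations) {
    auto lambda = [&c, x]() { c.value += x; };
    return time_ns([&](int) { lambda(); }, iterations);
}

// Variant 2: the state is passed as parameters.
std::int64_t time_parameter(Counter& c, int x, int iterations) {
    auto lambda = [](Counter& obj, int v) { obj.value += v; };
    return time_ns([&](int) { lambda(c, x); }, iterations);
}
```

Run each variant several times and compare the distributions rather than single numbers, since one-off timings are noisy.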

CodePudding user response：

Captures and parameters are simply different in nature and are optimized differently.

Let's look at your lambdas under the hood:

struct lambdaFunction1_type {
    // No captures: the closure has no data members.
    auto operator()(someClass& obj, int x) const { /* ... */ }
} lambdaFunction1{};

struct lambdaFunction2_type {
    // Each capture becomes a data member.
    someClass& someClassObj;
    int x;

    auto operator()() const { /* ... */ }
} lambdaFunction2{someClassObj, x};

As you can see, a capture acts like a class data member, and a parameter acts like... well, a parameter. They serve different purposes.
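One way to observe the difference is to ask whether the closure type has any data members. The sketch below uses `std::is_empty` for that. The `someClass` definition and the helper names are hypothetical, since the question does not show them. The standard leaves closure layout largely unspecified, so the result for the captureless lambda reflects what mainstream compilers do rather than a guarantee.

```cpp
#include <type_traits>

struct someClass { int value = 0; };  // hypothetical definition, not shown in the question

// True when the closure type F has no non-static data members.
template <class F>
constexpr bool is_stateless(const F&) { return std::is_empty<F>::value; }

bool parameter_lambda_is_stateless() {
    auto lambdaFunction1 = [](someClass& obj, int x) { obj.value += x; };
    return is_stateless(lambdaFunction1);  // no captures, so no members in practice
}

bool capturing_lambda_is_stateless() {
    someClass someClassObj;
    int x = 1;
    auto lambdaFunction2 = [&someClassObj, x]() { someClassObj.value += x; };
    return is_stateless(lambdaFunction2);  // x is stored by copy inside the closure
}
```

On common compilers the first function returns true and the second returns false, which matches the hand-written structs: only the capturing closure stores anything.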

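The two also reach the optimizer differently: captured values are read through the closure object's members, while parameters are passed at each call. When the call is inlined, either can usually be optimized away. As always, compiler optimizations are much less effective when using a polymorphic wrapper such as std::function. Such a wrapper may have to allocate dynamically to store the captures, and unless the compiler can elide that heap allocation, it limits how much optimization is possible.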

If I'm in a position where I can use either, should I always prefer one option over the other? Are there any rules about that?

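I'd say always go with what's clearest to you. If you have concerns about performance, measure, benchmark and profile.

You can also inspect the assembly the compiler generates for your code, for example with your compiler's -S option or an online tool such as Compiler Explorer.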