Home > Software design >  Function argument pass-by-value faster than pass-by-reference?
Function argument pass-by-value faster than pass-by-reference?

Time:06-03

I had an interview, where I get the following function declaration:

int f1(const std::vector<int> vec)

I suggested that instead of copying the vector, we should use a const-reference(not to mention that const copy does not makes much sense), but the interviewer claimed that the compiler copy-elision will handle it. I couldn't come up with any strong argument at the spot, but today I did some research.

I implemented the following two simple example functions:

int f1(const std::vector<int> vec) {
    const auto num = vec.size();
    return num * num;
}


int f2(const std::vector<int>& vec) {
    const auto num = vec.size();
    return num * num;
}

From the enter image description here

I suspected that maybe copy-elision is not allowed because of benchmark::DoNotOptimize(result);, but after removing it, I received a similar result.

Now I have these results, but I think it is still not convincing enough.

What do you think?

Do you have any good argument for using one over the other?

CodePudding user response:

Just assuming copy-elision is not a good idea: copying a vector is very expensive involving memory management and potentially copying a lot. Risking this copy over some marginal gain that you could have is not a good bet unless you know the behavior of your current and future compilers (copy-elision may be done, but is not mandated). One way to check whether copy elision was actually done is to check the pointer to the vector data on the caller side and in the function. Even if speed is not the greatest concern, more programmers expect the pass-by-const-ref over pass-by-value and thus don't have to question the reason for the decision.

Mind you that copy-elision is mandatory for initialization (from the link: when the initializer expression is a prvalue of the same class type (ignoring cv-qualification) as the variable type)

See also on cppreference

CodePudding user response:

Looking at the assembly of the functions f1 and f2 won't work. With optimizations enabled they are likely to look completely identical.

The performance difference here doesn't lie inside the function, but in the potential caller.

Function parameters are constructed in the context of the caller, not in the context of the callee. So, here with if f1 is called like

std::vector<int> vec{/*...*/};
auto res = f1(vec);

will cause vec to be copied into the function parameter in the caller. It is not possible to elide the copy. (Of course the compiler is free to optimize the copy away if it can see that doing so won't change the observable behavior of the program, but that generally can't happen if the function isn't inlined, for example because it's definition is in another translation unit and there is no link-time optimization.)

With

std::vector<int> vec{/*...*/};
auto res = f2(vec);

no copy is done, even conceptually. Copy elision is not relevant.

It is true that e.g.

std::vector<int> vec{/*...*/};
auto res = f1(std::move(vec));

will use the move constructor and will generally be cheap enough that it might be faster than using f2 under some circumstances where the function can't be inlined. (However that will depend on how the ABI specifies that a std::vector is passed to functions as well.)

It is also true that e.g.

auto res = f1(std::vector<int>{/*...*/});

will construct only one std::vector<int> if copy elision is applied (guaranteed since C 17 and optionally before). However, the same is true if f2 was used.

So all in all I would say, in some circumstances they are equally good, in some circumstances f2 is definitively the better choice and in some rare situations it might be that f1 is the better choice. I would go with f2 by default.

However, looking at the bigger picture, I would try to avoid having a function which iterates over a vector take the vector itself as parameter. Functions like this can typically be written more generally to apply to arbitrary ranges and hence it would probably be easy to make them templates and use either the interator interface or the C 20 range interface used by standard library algorithms. This way, if in the future someone decides to use some other container than std::vector, the function will still just work.

  • Related