I started with this piece of code to show the effects of NRVO:
struct test_nrvo {
    bool b;
    long x[100];  // Too large to return in register
};
test_nrvo tester(test_nrvo* p) {
    test_nrvo result{ p == &result };
    return result;
}
bool does_nrvo() {
    test_nrvo result(tester(&result));
    return result.b;
}
int main() {
    __builtin_printf("nrvo: %d\n", does_nrvo());
}
Which happily prints nrvo: 1 at -O0 on my compiler.
I tried to do a similar thing for RVO for returning a prvalue:
struct test_rvo {
    bool b;
    test_rvo(test_rvo* p) : b(p == this) {}
};
test_rvo tester(test_rvo* p) {
    return test_rvo(p);
}
bool does_rvo() {
    test_rvo result(tester(&result));
    return result.b;
}
int main() {
    __builtin_printf("rvo: %d\n", does_rvo());
}
https://godbolt.org/z/abh9v14bv (Both of these snippets with some logging)
And this prints rvo: 0. This implies that p and this are different pointers, so there are at least two different objects (and the result is moved from).
If I delete the move constructor or make it non-trivial, this prints rvo: 1, so it seems like this elision is allowed.
Confusingly the constexpr version fails in GCC, even with a deleted move constructor (where there absolutely could not have been two different objects, so it seems like a GCC bug). Clang seems to just have different behaviour to the non-constexpr version, so it seems like this is permitted-but-not-mandatory copy elision. Is that right? If so, what makes a function call different from other prvalues that must have their moves elided?
CodePudding user response:
Per [class.temporary]/3, mandatory copy elision (RVO) for the return value of a function has an exception.
If the type has at least one eligible copy or move constructor, all of them trivial, and if it has a trivial or deleted destructor, then the compiler is allowed not to apply RVO and create a temporary in which the function return value is held before it is copied/moved to initialize the variable.
test_rvo as shown satisfies these requirements and therefore RVO is not mandatory. With a deleted move constructor the requirements are no longer satisfied, so RVO must apply. That GCC does not do so in a constant evaluation context seems like a bug to me as well. In a constant evaluation context optional copy elision is never applied ([expr.const]/1), but that shouldn't affect the mandatory elision here. (I am not sure how [class.temporary]/3 is supposed to be handled in constant expressions though, since it is not mentioned.)
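As a rough sketch of the condition above, the standard type traits can approximate it. They are not an exact restatement of [class.temporary]/3, since they test whether a construction expression is trivial rather than inspecting each eligible constructor. The names trivial_rvo, nontrivial_rvo, make_nontrivial and nontrivial_is_elided are hypothetical, and the runtime check assumes C++17 or later, where initializing from the returned prvalue is guaranteed not to go through the move constructor.

```cpp
#include <type_traits>

// Same shape as test_rvo from the question: implicit (trivial) copy/move
// constructors and a trivial destructor.
struct trivial_rvo {
    bool b;
    trivial_rvo(trivial_rvo* p) : b(p == this) {}
};

// Hypothetical variant: the user-provided move constructor is non-trivial,
// and declaring it also deletes the implicit copy constructor.
struct nontrivial_rvo {
    bool b;
    nontrivial_rvo(nontrivial_rvo* p) : b(p == this) {}
    nontrivial_rvo(nontrivial_rvo&& other) : b(other.b) {}
};

// trivial_rvo falls under the exception: a temporary may be introduced.
static_assert(std::is_trivially_copy_constructible<trivial_rvo>::value, "");
static_assert(std::is_trivially_move_constructible<trivial_rvo>::value, "");
static_assert(std::is_trivially_destructible<trivial_rvo>::value, "");

// nontrivial_rvo does not: its only eligible copy/move constructor is non-trivial.
static_assert(!std::is_trivially_move_constructible<nontrivial_rvo>::value, "");
static_assert(!std::is_copy_constructible<nontrivial_rvo>::value, "");

nontrivial_rvo make_nontrivial(nontrivial_rvo* p) {
    return nontrivial_rvo(p);
}

// With mandatory elision, the prvalue initializes `result` directly,
// so `this` inside the constructor equals `&result`.
bool nontrivial_is_elided() {
    nontrivial_rvo result(make_nontrivial(&result));
    return result.b;
}
```

Calling nontrivial_is_elided() from main should report true under C++17, whereas the equivalent check on trivial_rvo may legitimately go either way depending on the compiler and ABI.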
The Itanium C++ ABI defines the type property "non-trivial for the purposes of calls", which mostly overlaps with the inverse of the condition in [class.temporary]/3. A return type without this property is returned according to the base C ABI, instead of having RVO applied as the Itanium C++ ABI specifies.
The underlying base C ABI here is the System V x86-64 psABI, which specifies (in 3.2.3 Returning of Values on page 24) that the return value for large structures (MEMORY class) is passed indirectly. Similarly to RVO, the caller provides storage for the return value and passes a pointer to that storage as an additional argument to the function. However, the ABI specification also includes the sentence
This storage must not overlap any data visible to the callee through other names than this argument.
I guess this is the reason that an additional copy needs to be performed: the ABI requires that the explicitly provided pointer argument does not refer to the implicitly reserved storage for the return value, which makes RVO impossible here.
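To make that reasoning concrete, here is a hand-written sketch of what the indirect return could look like if the hidden pointer were spelled out as an ordinary parameter. This is only an illustration of the argument above, not actual compiler output; big, tester_lowered, call_with_temporary and call_aliasing are hypothetical names.

```cpp
// A structure large enough to be classified as MEMORY by the psABI.
struct big {
    bool b;
    long x[100];
};

// Hypothetical hand-lowered form of `big tester(big* p)`:
// `ret_slot` plays the role of the hidden return-storage pointer.
void tester_lowered(big* ret_slot, big* p) {
    *ret_slot = big{};
    ret_slot->b = (p == ret_slot);
}

// What a conforming caller would do: the return storage is a fresh
// temporary that the callee cannot reach through `p`, followed by a copy.
bool call_with_temporary() {
    big result{};
    big tmp{};
    tester_lowered(&tmp, &result);
    result = tmp;     // the extra (trivial) copy
    return result.b;  // false: p and ret_slot differ
}

// What full elision would amount to: the return storage overlaps data
// the callee can also see through `p`, which the quoted sentence forbids.
bool call_aliasing() {
    big result{};
    tester_lowered(&result, &result);
    return result.b;  // true: p and ret_slot are the same pointer
}
```

The first caller corresponds to the rvo: 0 outcome in the question, the second to what rvo: 1 would require at the ABI level.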