Why are reference members of closure types needed?

[expr.prim.lambda.capture]/12:

An entity is captured by reference if it is implicitly or explicitly captured but not captured by copy. It is unspecified whether additional unnamed non-static data members are declared in the closure type for entities captured by reference. If declared, such non-static data members shall be of literal type.

Closure types already have direct access to the captured objects, so why would such reference members ever be needed? And why is the only requirement on them that they be of literal type?

CodePudding user response：

This wording is talking about the struct layout of the closure type. A lambda-expression corresponds to a struct-whose-name-you-don't-know which is basically

class Unnameable {
    ~~~data members~~~
public:
    auto operator()(~~~args~~~) const { ~~~body~~~ }
};

where all of the ~~~ bits are filled in based on the form of the lambda. For example, the lambda [=](int x) { return x + y; } (where y is an int captured from the outer scope) is going to correspond to a struct layout like this:

class Unnameable {
    int y;
public:
    auto operator()(int x) const { return x + y; }
};
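As a rough check that this lowering matches what the lambda does, here is a minimal sketch. ByCopyClosure, via_lambda, via_handwritten, and same_behavior are names made up for illustration; the real closure type is unnamed and its exact layout is up to the compiler.

```cpp
#include <cassert>

// Hand-written stand-in for the closure type of [=](int x) { return x + y; }.
// ByCopyClosure is a hypothetical name; the real closure type has no name.
class ByCopyClosure {
    int y;  // copy of the captured variable, taken when the closure is created
public:
    explicit ByCopyClosure(int y_) : y(y_) {}
    int operator()(int x) const { return x + y; }
};

// Both functions modify the outer y after the capture; neither call sees it.
int via_lambda(int arg) {
    int y = 10;
    auto lam = [=](int x) { return x + y; };
    y = 100;
    return lam(arg);
}

int via_handwritten(int arg) {
    int y = 10;
    ByCopyClosure lam(y);
    y = 100;
    return lam(arg);
}

// The lambda and the hand-written class agree on the result.
void same_behavior() {
    assert(via_lambda(1) == via_handwritten(1));
}
```

Both versions add the value y had at capture time, which is the whole point of capture by copy.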

When the capture of y happens by copy, this transformation is pretty obvious. (The circumstances under which capture-by-copy happens are described a few lines above the part you quoted.) But, when the capture of y happens by reference, the transformation is not obvious. The compiler could just lower it into

class Unnameable {
    int& y;
public:
    auto operator()(int x) const { return x + y; }
};

but then class Unnameable wouldn't be copy-assignable. (An earlier version of this answer claimed that the closure object needs to be copy-assignable. It doesn't: closure types with captures are not assignable, so I'm not sure why the reference member wouldn't just work. But see the clever technique below, anyway.) Either way, the compiler may instead lower it into something more like

class Unnameable {
    int *py;
public:
    auto operator()(int x) const { return x + *py; }
};
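The difference between the two lowerings can be checked with type traits. The following is only a sketch: RefMember and PtrMember are hypothetical hand-written stand-ins, not what any particular compiler generates, and via_ref_lambda and via_ptr_member are made-up helper names.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical hand-written lowerings of [&](int x) { return x + y; }.
struct RefMember {
    int& y;
    int operator()(int x) const { return x + y; }
};
struct PtrMember {
    int* py;
    int operator()(int x) const { return x + *py; }
};

// A reference member deletes the implicit copy assignment operator;
// a pointer member does not.
static_assert(!std::is_copy_assignable<RefMember>::value, "");
static_assert(std::is_copy_assignable<PtrMember>::value, "");

int via_ref_lambda(int arg) {
    int y = 10;
    auto lam = [&](int x) { return x + y; };
    // A closure type with captures has a deleted copy assignment operator.
    static_assert(!std::is_copy_assignable<decltype(lam)>::value, "");
    y = 100;  // visible through the by-reference capture
    return lam(arg);
}

int via_ptr_member(int arg) {
    int y = 10;
    PtrMember lam{&y};
    y = 100;  // visible through the stored pointer
    return lam(arg);
}

// Both forms observe the later write to y.
void same_behavior() {
    assert(via_ref_lambda(1) == via_ptr_member(1));
}
```

Since the closure type itself is not assignable, both member layouts would give the same observable behavior here; the choice between them is invisible to the program.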

This is why the wording is so coy about exactly what "additional unnamed non-static data members are declared in the closure type for entities captured by reference." All it guarantees is that "If declared, such non-static data members shall be of literal type" (i.e., they won't accidentally make the closure object non-constexpr-friendly).

Now, the wording in the Standard actually allows a "sufficiently smart compiler" to go even further — although no vendor actually does, AFAIK. Suppose the lambda in question is this one:

int a = 1, b = 2;
auto lam = [&]() { return a + b; };
std::function<int()> f = lam;
a = 3;
assert(f() == 5);

Since the compiler happens to know the layout of a and b on the function's stack frame, it is totally permitted for the compiler to generate a closure type with this struct layout:

class Unnameable {
    int *p;  // initialized with &a, which happens to be &b-1
public:
    auto operator()() const { return p[0] + p[1]; }
};

The Standard could forbid this implementation by saying something like, "For each entity captured by reference, an unnamed non-static data member is declared in the closure type. If the entity is of type T, then the type of such a data member is 'pointer to remove_cvref_t<T>.'" But that would forbid this kind of clever implementation technique, so, they just left it unspecified.
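User code can't portably do pointer arithmetic between two separate local variables, so the single-pointer trick can only be approximated by hand. The sketch below assumes the two locals are grouped into a struct (Frame, a made-up stand-in for the stack frame); OnePointerClosure and run_example are likewise hypothetical names.

```cpp
#include <cassert>
#include <functional>

// Stand-in for the enclosing function's stack frame. In real code a and b
// are separate locals whose layout only the compiler knows; grouping them
// in a struct is this sketch's assumption, so that the code stays well-defined.
struct Frame {
    int a;
    int b;
};

class OnePointerClosure {
    const Frame* frame;  // one pointer serves both by-reference captures
public:
    explicit OnePointerClosure(const Frame& f) : frame(&f) {}
    int operator()() const { return frame->a + frame->b; }
};

int run_example() {
    Frame locals{1, 2};
    OnePointerClosure lam(locals);
    std::function<int()> f = lam;  // copies the closure, not the frame
    locals.a = 3;
    int result = f();
    assert(result == 5);  // same outcome as the [&] lambda example
    return result;
}
```

The closure holds one pointer no matter how many locals it refers to, which is the kind of saving the unspecified wording leaves room for.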

CodePudding user response：

The requirement for literal type was added by P0170R1 (constexpr lambda). The intent was to ensure that by-reference captures are creatable and usable in constexpr.
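A small sketch of what that enables, assuming a C++17 (or later) compiler; sum_through_ref_capture is a made-up name:

```cpp
// Requires C++17 or later (constexpr lambdas).
constexpr int sum_through_ref_capture(int a, int b) {
    auto lam = [&] { return a + b; };  // captures a and b by reference
    a = 3;                             // seen by the closure
    return lam();
}

// The closure is created and called during constant evaluation.
static_assert(sum_through_ref_capture(1, 2) == 5, "usable in constexpr");
```

If the unspecified members were allowed to be of non-literal type, the closure object could not be created in a constant expression like this.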

I don't think reference members of closure types are strictly needed. An implementation may choose to add one pointer member for each by-reference capture, or even to add no member at all if it knows that such a capture is not used.