How to avoid copy/memory overhead when wrapping a stack allocated object?

Let's say I have a large blob object that is stack allocated. I need to put that in a wrapper object but I want to avoid a copy. Should I just use std::move with a move constructor? What would be the easiest way to prove that it works?

struct Blob {
    char blob[1024 * 1024]; // imagine something big here
};

template <typename T>
struct Foo {
    Foo(T&& src) : data{src} {}
    T data;
};

int main() {
    Blob blob;
    Foo foo{std::move(blob)};  // do not copy
    // should not take twice the memory of a blob
}

CodePudding user response：

In this case, Blob is plain old data. The compiler is free to optimize the variable Blob blob out of existence.

It is also free to make a million copies of it for no reason whatsoever. The standard does not constrain it.

Compilers commonly use an intermediate representation called "static single assignment" (SSA), in which each state of a local object becomes its own independent variable. This lets the optimizer get rid of a pile of nonsense copies or other state changes that don't matter. Sufficiently complex code, or references/pointers that escape beyond what the optimizer can see, block it.

So, in practice, just don't block the compiler from optimizing Blob blob's existence away.

That being said

Foo(T&& src) : data{src} {}

this should read

Foo(T&& src) : data{std::forward<T>(src)} {}
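One way to see the difference between the two initializers is to wrap an instrumented type instead of Blob. The sketch below is only an illustration: Tracked, CopyingFoo, ForwardingFoo and copy_vs_forward_demo are hypothetical names, not part of the question. It assumes C++17 (for the inline static members). Keep in mind that for a plain char array like Blob the implicit move constructor still copies every byte, so forwarding mainly pays off for types that own resources.

```cpp
#include <cassert>
#include <utility>

// Hypothetical instrumented type: counts which special member gets called.
struct Tracked {
    static inline int copies = 0;
    static inline int moves = 0;
    Tracked() = default;
    Tracked(const Tracked&) { ++copies; }
    Tracked(Tracked&&) noexcept { ++moves; }
};

// The constructor from the question: src is a named variable, hence an lvalue.
template <typename T>
struct CopyingFoo {
    CopyingFoo(T&& src) : data{src} {}  // selects the copy constructor
    T data;
};

// The corrected constructor: std::forward<T> turns src back into an rvalue.
template <typename T>
struct ForwardingFoo {
    ForwardingFoo(T&& src) : data{std::forward<T>(src)} {}  // selects the move constructor
    T data;
};

void copy_vs_forward_demo() {
    Tracked::copies = 0;
    Tracked::moves = 0;

    Tracked a;
    CopyingFoo<Tracked> copied{std::move(a)};
    assert(Tracked::copies == 1 && Tracked::moves == 0);

    Tracked b;
    ForwardingFoo<Tracked> moved{std::move(b)};
    assert(Tracked::copies == 1 && Tracked::moves == 1);

    (void)copied;
    (void)moved;
}
```

Calling copy_vs_forward_demo() from main runs both checks; the asserts fire if the wrong constructor is picked.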

Also, you could do a

Foo<Blob&> foo(blob);

and have Foo store a reference to the Blob data. This changes the "value semantics" of Foo in nasty ways however.
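A small sketch of what the reference-storing variant looks like in practice, and why its value semantics are surprising. The Blob here is scaled down and reference_wrapper_demo is a hypothetical helper name; C++17 is assumed.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

struct Blob {
    char blob[1024];  // scaled down from the question's 1 MiB for this sketch
};

template <typename T>
struct Foo {
    Foo(T&& src) : data{std::forward<T>(src)} {}
    T data;
};

void reference_wrapper_demo() {
    Blob blob{};
    Foo<Blob&> foo(blob);  // T = Blob&, so T&& collapses to Blob& and data is a reference

    static_assert(std::is_same_v<decltype(foo.data), Blob&>);
    assert(&foo.data == &blob);  // no second Blob exists

    Foo<Blob&> copy = foo;       // copies the reference, not the bytes
    copy.data.blob[0] = 'x';
    assert(blob.blob[0] == 'x'); // writing through the "copy" changed the original
}
```

The last two lines are the nasty part: copying the wrapper does not copy the data, and the wrapper dangles if it outlives blob.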

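Note that your implicit deduction guide is surprising too: because T is the class's own template parameter, T&& in this constructor is a plain rvalue reference rather than a forwarding reference. Passing an rvalue, as in Foo foo{std::move(blob)}, deduces Foo<Blob>, while passing an lvalue, as in Foo foo{blob}, fails to deduce at all. A Foo wrapping a reference has to be spelled out explicitly as Foo<Blob&>.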

If you want a hard guarantee, C++ doesn't provide that. C++ doesn't even guarantee that Blob blob; actually takes up space on the stack.

Going further, you can do this:

int main() {
  Foo foo{[&]{
    Blob blob;
    return blob;
  }()};
}

In this case, Blob blob can be constructed directly in the return value (named return value optimization, which is permitted but not required). The return value is then passed to foo, and its lifetime ends at the end of the full-expression. Instead of a lambda, you can also use another function outside of main.

This makes it a bit easier for the compiler to work out that there is no point in a separate Blob object, but not by a huge amount.
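To check what actually happens with the immediately invoked lambda, one option is to count constructor calls on an instrumented stand-in for Blob. This is a sketch under assumptions: Tracked and lambda_demo are hypothetical names, and C++17 is assumed. Since NRVO is optional, the number of moves is allowed to be 1 or 2; what the check pins down is that no copy is made.

```cpp
#include <cassert>
#include <utility>

// Hypothetical instrumented stand-in for Blob.
struct Tracked {
    static inline int copies = 0;
    static inline int moves = 0;
    Tracked() = default;
    Tracked(const Tracked&) { ++copies; }
    Tracked(Tracked&&) noexcept { ++moves; }
};

template <typename T>
struct Foo {
    Foo(T&& src) : data{std::forward<T>(src)} {}
    T data;
};

void lambda_demo() {
    Tracked::copies = 0;
    Tracked::moves = 0;

    Foo foo{[&] {
        Tracked t;
        return t;  // NRVO candidate; falls back to a move if the compiler does not elide
    }()};
    (void)foo;

    assert(Tracked::copies == 0);
    // One move into foo.data, plus at most one more if NRVO was not applied.
    assert(Tracked::moves == 1 || Tracked::moves == 2);
}
```

Printing Tracked::moves after the call shows whether a given compiler and optimization level elided the local.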

If you want elision of Blob blob directly into Foo's data to be closer to guaranteed, add a Foo constructor that takes a Blob factory:

template <typename T>
struct Foo {
    template<class U>
    requires std::is_same_v< std::decay_t<U>, T >
    Foo(U&& src) : data{std::forward<U>(src)} {}
    template<class F>
    requires std::is_invocable_r_v< T, F&& >
    Foo(F&& f) : data(std::forward<F>(f)()) {}
    T data;
};
template<class T>
requires (!std::is_invocable_v< T&& >)
Foo(T&&)->Foo<std::decay_t<T>>;
template<class F>
requires std::is_invocable_v< F&& >
Foo(F&&)->Foo<std::decay_t<std::invoke_result_t<F&&>>>;

int main() {
  Foo foo{[&]{
    Blob blob;
    return blob;
  }};
}

which is probably going way, way too far.
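A compile-time way to convince yourself that the factory route involves no copy or move at all is to wrap a type whose copy and move constructors are deleted. The sketch below is a reduced variant of the factory constructor, written with std::enable_if_t instead of requires so that it also compiles as C++17. Pinned and factory_demo are hypothetical names, and the factory here returns a prvalue rather than a named local, which is what makes the elision mandatory.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical type that can be neither copied nor moved.
struct Pinned {
    Pinned() = default;
    Pinned(const Pinned&) = delete;
    Pinned(Pinned&&) = delete;
    int value = 42;
};

template <typename T>
struct Foo {
    // Accept any callable whose result type is exactly T.
    template <class F,
              std::enable_if_t<std::is_same_v<std::invoke_result_t<F&&>, T>, int> = 0>
    Foo(F&& f) : data(std::forward<F>(f)()) {}  // the prvalue result initializes data in place
    T data;
};

// Deduce Foo<T> from the factory's return type.
template <class F>
Foo(F&&) -> Foo<std::decay_t<std::invoke_result_t<F&&>>>;

int factory_demo() {
    Foo foo{[] { return Pinned{}; }};  // compiles only because nothing is copied or moved
    static_assert(std::is_same_v<decltype(foo), Foo<Pinned>>);
    return foo.data.value;
}
```

If the lambda instead returned a named local Pinned, the code would stop compiling, because returning a named object still requires an accessible copy or move constructor.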

CodePudding user response：

Either:

Keep a reference to the original object in the wrapper and access it through that reference.
Use NRVO to avoid copying, e.g.:

struct Blob {
    char blob[1024 * 1024]; // imagine something big here
};

template <typename T>
struct Foo {
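    // U defaults to T so the condition depends on this template's own parameter (SFINAE-friendly)
    template<typename U = T, typename = std::enable_if_t<std::is_default_constructible<U>::value>>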
    Foo(): data(){}
    T data;
};

Foo<Blob> ProcessBlob(){
   Foo<Blob> value{};
   Blob& blob = value.data;
   // Use `blob` here
   return value; // NRVO can eliminate the copy (permitted, not guaranteed)
}

void ProcessIn2Steps(){
   Foo<Blob> wrapped = ProcessBlob();
   // Use `wrapped` here
}
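A simple way to check whether NRVO actually kicked in is to compare the address of the local inside the function with the address of the object the caller ends up with. The sketch below assumes C++17. g_local_address and nrvo_was_applied are hypothetical names, Foo's constructor is simplified, and Blob is scaled down. The result depends on the compiler and its settings, so the function reports it instead of asserting it.

```cpp
#include <cassert>
#include <cstdint>

struct Blob {
    char blob[64 * 1024];  // scaled down from the question's 1 MiB for this sketch
};

template <typename T>
struct Foo {
    Foo() : data() {}
    T data;
};

// Hypothetical global used only to record where the local object lived.
static std::uintptr_t g_local_address = 0;

Foo<Blob> ProcessBlob() {
    Foo<Blob> value{};
    value.data.blob[0] = 'x';
    g_local_address = reinterpret_cast<std::uintptr_t>(&value);
    return value;  // NRVO candidate
}

// Returns true if the caller's object occupies the storage `value` had,
// which means the return did not copy or move anything.
bool nrvo_was_applied() {
    Foo<Blob> wrapped = ProcessBlob();
    assert(wrapped.data.blob[0] == 'x');
    return g_local_address == reinterpret_cast<std::uintptr_t>(&wrapped);
}
```

Printing the result of nrvo_was_applied() under different optimization levels gives a quick empirical answer for a specific toolchain.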

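Also, if you really want to rule out copies, consider deleting Foo's copy and move operations. Be aware that return value; in ProcessBlob then stops compiling, because returning a named local requires an accessible copy or move constructor even when NRVO would elide the call; only returning a prvalue such as return Foo<Blob>{}; still works (C++17). The class becomes: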

template <typename T>
struct Foo {
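    // U defaults to T so the condition depends on this template's own parameter (SFINAE-friendly)
    template<typename U = T, typename = std::enable_if_t<std::is_default_constructible<U>::value>>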
    Foo(): data(){}
    Foo(const Foo&) = delete;
    Foo(Foo&&) = delete;
    Foo& operator=(const Foo&) = delete;
    Foo& operator=(Foo&&) = delete;
    T data;
};
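With copy and move deleted, the compiler itself enforces the "no copy" property. A minimal sketch of how such a Foo can still be returned from a function, assuming C++17 guaranteed elision; MakeWrapped and first_byte_after_init are hypothetical names, the constructor is simplified, and Blob is scaled down:

```cpp
#include <cassert>
#include <type_traits>

struct Blob {
    char blob[64 * 1024];  // scaled down from the question's 1 MiB for this sketch
};

template <typename T>
struct Foo {
    Foo() : data() {}
    Foo(const Foo&) = delete;
    Foo(Foo&&) = delete;
    Foo& operator=(const Foo&) = delete;
    Foo& operator=(Foo&&) = delete;
    T data;
};

// Any attempt to copy or move a Foo<Blob> is now a compile error.
static_assert(!std::is_copy_constructible_v<Foo<Blob>>);
static_assert(!std::is_move_constructible_v<Foo<Blob>>);

Foo<Blob> MakeWrapped() {
    // A prvalue is constructed directly in the caller's object.
    // Returning a named local here would not compile.
    return Foo<Blob>{};
}

char first_byte_after_init() {
    Foo<Blob> wrapped = MakeWrapped();
    wrapped.data.blob[0] = 'a';  // fill the data in place, after construction
    return wrapped.data.blob[0];
}
```

The trade-off is that the data has to be filled in after construction, or by the constructor itself, since a finished Foo can no longer be handed around by value from a named variable.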