I have been looking through GCC's implementation of std::function
(I got there while debugging and went off on a tangent).
From what I can see, it stores small types in local storage, and anything that does not fit is heap-allocated with new.
However, the constructor also checks the __is_location_invariant
metafunction, which is a wrapper around the std::is_trivially_copyable
trait, and if the type isn't "location invariant", it is also allocated on the heap.
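The storage decision described above can be sketched as a compile-time predicate. This is only an illustration: the names kLocalSize, kLocalAlign and stored_locally_v are hypothetical, and the buffer size is an assumption rather than the value libstdc++ actually uses.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Assumed size and alignment of the local buffer (hypothetical values;
// the real ones are an implementation detail of the library).
constexpr std::size_t kLocalSize = 2 * sizeof(void*);
constexpr std::size_t kLocalAlign = alignof(void*);

// A callable is kept in the local buffer only if it is trivially copyable
// ("location invariant") and fits the buffer's size and alignment.
template <typename F>
constexpr bool stored_locally_v =
    std::is_trivially_copyable<F>::value &&
    sizeof(F) <= kLocalSize &&
    alignof(F) <= kLocalAlign;

// Small and trivially copyable: qualifies for local storage.
struct SmallTrivial { int a; int b; };

// Trivially copyable but too large: goes to the heap.
struct BigTrivial { char bytes[256]; };

// Small, but the user-provided copy constructor makes it non-trivially
// copyable, so it also goes to the heap.
struct SmallNonTrivial {
    int* p;
    SmallNonTrivial(const SmallNonTrivial& other) : p(other.p) {}
};

static_assert(stored_locally_v<SmallTrivial>, "fits and is trivially copyable");
static_assert(!stored_locally_v<BigTrivial>, "too large for the buffer");
static_assert(!stored_locally_v<SmallNonTrivial>, "not trivially copyable");
```

Under these assumptions, only SmallTrivial avoids the heap; the other two types fail one of the two conditions.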
I do not completely understand why it does this, since as far as I understand
::new (storage) T(args)
should produce the same result as
new T(args)
except that the placement form does not allocate any memory.
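To make the comparison concrete, here is a minimal sketch of the two forms side by side. Point, sum_in_place and sum_on_heap are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <new>

// A small example type; any constructible type would do.
struct Point {
    int x;
    int y;
    Point(int x_, int y_) : x(x_), y(y_) {}
};

// Construct in caller-provided storage: no memory is allocated.
inline int sum_in_place() {
    alignas(Point) unsigned char storage[sizeof(Point)];
    Point* p = ::new (static_cast<void*>(storage)) Point(1, 2);
    int sum = p->x + p->y;
    p->~Point();  // placement new requires an explicit destructor call
    return sum;
}

// Construct on the heap: same constructor, plus an allocation.
inline int sum_on_heap() {
    Point* p = new Point(1, 2);
    int sum = p->x + p->y;
    delete p;  // destroys the object and frees the memory
    return sum;
}
```

Both functions run the same constructor and observe the same object state; they differ only in where the storage comes from and how it is released.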
It would make more sense to me if it, for example, used a single reference-counted object to store the "location invariant" types that are too large to fit in local storage, as that would reduce the number of allocations and copies. Non-invariant objects would still be allocated and copied every time since, being "location-dependent", they cannot all refer to the same storage.
However, the implementation seems to just heap-allocate anything that does not fit and/or is not location invariant (at least, I did not see it doing anything else), so I am quite confused about why it needs to check for location invariance if there is no apparent difference in functionality.
CodePudding user response:
It appears that libstdc++ uses the "location-invariant" property to simplify certain operations on std::function. Namely, the storage for the callable is provided by a union named _Any_data, which contains a char array. This char array can provide storage either for a pointer to the actual callable (if it is allocated on the heap) or for the callable itself (if it qualifies for the small object optimization). When a std::function is move-constructed, the _Any_data member of the RHS only needs to be trivially copied over to the _Any_data member of *this (plus the RHS has to be marked as null somehow). This works both when _Any_data stores a pointer to a heap-allocated callable (since the pointer is trivially copyable) and when it stores a small callable inline (since the callable in this case is required to be trivially copyable). Similarly, the swap operation on std::function may be implemented as a trivial swap of the _Any_data members, and the copy/move assignment operations are both implemented using the copy-swap idiom.
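The idea of moving by bytewise copy can be sketched as follows. This is a simplified illustration, not the library's code: AnyData, AddN, store, load and relocate are hypothetical names, and the buffer size is an assumption.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

// Hypothetical stand-in for _Any_data: a raw buffer that holds either a
// small callable or a pointer to a heap-allocated one.
struct AnyData {
    alignas(void*) unsigned char bytes[2 * sizeof(void*)];
};

// A small, trivially copyable callable used only for this illustration.
struct AddN {
    int n;
    int operator()(int x) const { return x + n; }
};

// Copy a trivially copyable value (callable or pointer) into the buffer.
template <typename T>
void store(AnyData& d, const T& value) {
    static_assert(std::is_trivially_copyable<T>::value, "needs bytewise copy");
    static_assert(sizeof(T) <= sizeof(AnyData), "must fit the buffer");
    std::memcpy(d.bytes, &value, sizeof(T));
}

// Read a trivially copyable value back out of the buffer.
template <typename T>
T load(const AnyData& d) {
    T value;
    std::memcpy(&value, d.bytes, sizeof(T));
    return value;
}

// "Move construction": one bytewise copy that does not need to know
// whether the buffer holds an inline callable or a heap pointer.
inline AnyData relocate(const AnyData& src) noexcept {
    AnyData dst;
    std::memcpy(dst.bytes, src.bytes, sizeof(dst.bytes));
    return dst;
}

// Inline case: the callable itself travels with the bytes.
inline int demo_inline() {
    AnyData a;
    store(a, AddN{5});
    AnyData b = relocate(a);
    return load<AddN>(b)(10);
}

// Heap case: only the pointer travels; the pointee stays where it is.
inline int demo_heap() {
    AnyData a;
    store(a, new AddN{7});
    AnyData b = relocate(a);
    AddN* p = load<AddN*>(b);
    int result = (*p)(1);
    delete p;
    return result;
}
```

The same relocate function serves both cases, which is the simplification the trivially-copyable requirement buys: no per-type code runs during the move.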
It's possible to be somewhat more generous: the small object optimization could theoretically be supported for any callable type that is either nothrow-copy-constructible or nothrow-move-constructible. [1] However, if the type is not trivially copyable, this imposes additional complexity on the implementation. Consider how to write the move constructor of std::function when the RHS may store inline an object that is not trivially copyable. In this case, the move (or copy) constructor of the callable must be called conditionally, depending on whether the stored metadata indicates that the constructor is nontrivial. This is not difficult to implement: an additional method simply has to be added to the manager object. However, it implies that an extra indirect function call must be performed every single time a std::function is move-constructed. In the case of a swap operation, three such calls would be required.
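A rough sketch of what such an extra manager operation could look like is given below. Everything here (Slot, RelocateFn, relocate_impl, emplace, move_slot, swap_slots, Counted, g_relocations) is hypothetical and heavily simplified; in particular, a real manager would also need destroy and copy operations.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <type_traits>
#include <utility>

// Counts indirect relocation calls, for illustration only.
static int g_relocations = 0;

// Hypothetical manager operation: move-construct the callable from src
// into dst, then destroy the source object.
using RelocateFn = void (*)(void* dst, void* src);

template <typename F>
void relocate_impl(void* dst, void* src) {
    static_assert(std::is_nothrow_move_constructible<F>::value,
                  "inline storage requires a nothrow move");
    F* from = static_cast<F*>(src);
    ::new (dst) F(std::move(*from));
    from->~F();
    ++g_relocations;
}

// Inline storage plus the per-type relocation hook (null means empty).
struct Slot {
    alignas(std::max_align_t) unsigned char bytes[64];
    RelocateFn relocate = nullptr;
};

// Construct a callable inline and remember how to relocate it.
template <typename F>
void emplace(Slot& s, F f) {
    static_assert(sizeof(F) <= sizeof(s.bytes), "must fit the buffer");
    ::new (static_cast<void*>(s.bytes)) F(std::move(f));
    s.relocate = &relocate_impl<F>;
}

// Move from src into an empty dst: one indirect call per move.
inline void move_slot(Slot& dst, Slot& src) {
    if (src.relocate) {
        src.relocate(dst.bytes, src.bytes);
        dst.relocate = src.relocate;
        src.relocate = nullptr;
    }
}

// Swap through a temporary: three relocations, hence three indirect calls.
inline void swap_slots(Slot& a, Slot& b) {
    Slot tmp;
    move_slot(tmp, a);
    move_slot(a, b);
    move_slot(b, tmp);
}

// Access the stored object, assuming the caller knows its type.
template <typename F>
F& peek(Slot& s) {
    return *reinterpret_cast<F*>(s.bytes);
}

// Nothrow-movable but not trivially copyable.
struct Counted {
    int value;
    explicit Counted(int v) : value(v) {}
    Counted(Counted&& other) noexcept : value(other.value) { other.value = -1; }
};
static_assert(!std::is_trivially_copyable<Counted>::value,
              "Counted has a user-provided move constructor");
```

In this sketch, swapping two slots that each hold a Counted increments g_relocations by three, whereas the bytewise approach needs no per-type calls at all. That difference is the cost the paragraph above refers to.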
The trade-off that the implementor needs to make is whether types that fit into the small object buffer and are nothrow-movable (but not trivially copyable) are common enough that the benefit of allowing them to be stored in the small object buffer outweighs the costs of additional indirect function calls used by the move and swap operations for all callable types.
[1] The reason (or at least one reason) why this requirement is necessary is that swapping two std::function objects is required to always succeed. The swap operation must actually relocate any values that are stored inline (as opposed to ones that are on the heap, in which case ownership of the pointer may simply be transferred to the other std::function object). If the underlying copy or move involved in such relocation is not noexcept, then it's impossible to guarantee that the swap will succeed; therefore heap allocation is the only option.