Code like
std::vector<int> a;
for(size_t i = 0; i < n; i)
a.push_back(0);
is guaranteed to run in linear time in n
. This is achieved by allocating some additional spare capacity when reallocating (typically increasing the total capacity by a constant factor).
But what about std::vector::insert(pos, x)
? I.e. is it guaranteed that
std::vector<int> a;
for(size_t i = 0; i < n; i)
a.insert(a.end(), 0);
is linear as well? (similar question applies to insert(pos, first, last)
of course)
The documentaiton of push_back
is clear in guaranteeing "amortized constant" complexity.
But the documentation of insert
says the complexity should be "constant plus linear in the distance between pos and end of the container". This can obviously only be true if no reallocation happens, thus my uncertainty.
EDIT: Summary of the answers: When a reallocation occurs, the implementation can either
- increase the capacity the bare minimum
- or increase the capacity more than the minimum, thus making future insertions faster.
In the case of push_back
, option (2) is essentially required to achieve the "amortized constant" runtime. In the case of insert(pos, first, last)
, option (2) is required in case of single-pass InputIterator (But in case of multi-pass ForwardIterator, the implementation can and does use a single reallocation with new_capacity = max(2*old_capacity, new_size)
). All other cases are implementation defined.
Testing with GCC 11 and clang 12, the situation seems to be:
reserve
andassign
increase capacity by the bare minimum, andpush_back
andinsert
andresize
increase the capacity by a constant factor
CodePudding user response:
The actual language is
[vector.overview] A vector is a sequence container that supports (amortized) constant time insert and erase operations at the end; insert and erase in the middle take linear time
and more specifically
[vector.modifiers 1] Complexity: If reallocation happens, linear in the number of elements of the resulting vector; otherwise, linear in the number of elements inserted plus the distance to the end of the vector.
but we both agree it's broadly linear.
When reallocation happens, the options are:
reuse the same internal mechanism for growing the vector as in
push_back()
:this is obviously still amortized constant time, so the linear-time element move still dominates
grow the vector by just one element:
since this is still linear time, it's still within the complexity allowed. However, it is actually more effort for the implementer, and objectively worse
I don't think I've ever seen an implementation that didn't just reuse some internal grow()
method for both of these, but it would be technically legal to spend more effort doing something worse.
The exception to this reasoning is the range overload insert(iterator pos, InputIterator begin, InputIterator end)
since, for a strict LegacyInputIterator, you can only do one pass.
If you can't pre-calculate the number of elements to grow by, any growth must be amortized constant time, or the overall complexity would be governed by N * distance(begin,end), which could be O(N²)
and thus non-conformant.
tl;dr
push_back
must allocate extra capacity to remain amortized constant timeinsert(pos, begin, end)
must use amortized constant-time growth for each element of(begin,end]
to remain amortized linear time overallinsert(pos, value)
does not need to allocate extra capacity in order to meet its complexity requirement, but it's probably more effort for the implementer to get a worse result
CodePudding user response:
"Amortized constant" means "constant most of the time". push_back
will be O(1) for most of the calls. If reallocation is required, it will be linear, but this should happen relatively rarely.
If reallocation is required for insert
, all elements will be moved, that's O(n). If reallocation is not required, all the elements to the right of newly inserted elements are moved, but that's still O(n).
Cppreference doesn't seem to mention reallocation in Complexity section reliably. For example, resize()
mentions possible additional complexity in Complexity section, but push_back()
and insert()
mention reallocation only in the function description. Might be worth to clean it up one day.