How does dynamic arrays(std::vector) work in c ?-CodePudding

According to what I heard std::vector stores data contiguously and when you try to add or delete an element to it grow or shrink by allocating new memory and copies everything from the old memory to the new memory with the changes and deletes the old memory

in the code below I have a copy constructor it says copied every time the class gets copied it gave me 6 copied messages when I had 3 elements and which was fine but when I added another element it gave me 7 messages

Isn't it supposed to give me 10? And how to optimize that?

#include <iostream>
#include <vector>

struct Vertex
{
    int x, y, z;

    Vertex(int x, int y, int z) : x(x), y(y), z(z)
    {
    }

    Vertex(const Vertex &vetrex) : x(vetrex.x), y(vetrex.y), z(vetrex.z)
    {
        std::cout << "Copied" << std::endl;
    }
};

int main()
{
    std::vector<Vertex> vertices;
    vertices.push_back({1, 2, 3});
    vertices.push_back({4, 5, 6});
    vertices.push_back({7, 8, 9});
    vertices.push_back({10, 11, 12});
}

CodePudding user response：

Let's step through it:

std::vector<Vertex> vertices;

At this point, capacity is 0, size is 0.

vertices.push_back({1, 2, 3});

Vector allocates a buffer of size 1 to hold the element. copy construct argument passed to push_back into vector [1/7 copies]

vertices.push_back({4, 5, 6});

vector capacity is 1, size is 1. It is full, so to add a 2nd element it allocates a new buffer double in size (now 2).

copy the existing element in vector into new memory [2/7 copies]
copy construct argument passed to push_back into vector [3/7 copies]
delete old buffer (and destroy objects in it)

then:

vertices.push_back({7, 8, 9});

vector capacity is 2, size is 2. Again it is again full and doubles its buffer size by allocating a buffer of size 4.

copy 2 elements from old allocation to new allocation [(4&5)/7 copies]
copy argument passed to push_back into vector [6/7] copies
delete old buffer (and destroy objects in it)

then:

vertices.push_back({10, 11, 12});

Vector capacity is 4, size is 3. It has room for the new element without resizing.

copy argument passed to push_back into vector [7/7] copies

Note, the vector "doubling" is common but not required by the standard. Implementations could resize to other amounts, so do not depend on anticipating the reallocation size.

Also, to reduce the number of reallocations, you can reserve size in advance of all the insertions. In that case it's FAR more efficient:

std::vector<Vertex> vertices;
vertices.reserve(4);

If you add the call to reserve space for 4 elements at the beginning, then there are only 4 copies--one per element added into the vector.

CodePudding user response：

According to what I heard std::vector stores data contiguously and when you try to add or delete an element to it grow or shrink by allocating new memory and copies everything from the old memory to the new memory with the changes and deletes the old memory

std::vector<T> doesn't actually grow every single time you add an element to it. It has a notion of something called capacity (See std::vector<T>::capacity()).

The actual number that grows or shrinks every single time an element is added is the std::vector<T>::size() which is just a restricting boundary over the capacity. The capacity is what actually tells you how many elements have been allocated at the time.

Essentially, in most implementations, the capacity of a std::vector<T> grows by powers of 2. (There is a reason behind doing this and this is mostly related to performance.)

So, examine the capacity in each of the push_back() calls to see this:

int main()
{
    std::vector<Vertex> vertices;
    std::cout << "Size: " << vertices.size() << std::endl;         // Size: 0
    std::cout << "Capacity: " << vertices.capacity() << std::endl; // Capacity: 0
    vertices.push_back({1, 2, 3});                                 // 1 copy 
    std::cout << "***" << std::endl;

    std::cout << "Size: " << vertices.size() << std::endl;         // Size: 1
    std::cout << "Capacity: " << vertices.capacity() << std::endl; // Capacity: 2^0 = 1
    vertices.push_back({4, 5, 6});                                 // 2 copies (due to reallocation)
    std::cout << "***" << std::endl;

    std::cout << "Size: " << vertices.size() << std::endl;         // Size: 2
    std::cout << "Capacity: " << vertices.capacity() << std::endl; // Capacity: 2^1 = 2
    vertices.push_back({7, 8, 9});                                 // 3 copies (due to reallocation)
    std::cout << "***" << std::endl;

    std::cout << "Size: " << vertices.size() << std::endl;         // Size: 3
    std::cout << "Capacity: " << vertices.capacity() << std::endl; // Capacity: 2^2 = 4
    vertices.push_back({10, 11, 12});                              // 1 copy
    std::cout << "***" << std::endl;
}

As you can see, in the fourth time you call push_back(), the capacity of vertices is 4 is and its size is 3 so it conveniently just does one copy and doesn't have to reallocate anything as there is already enough space.

And how to optimize that?

If you already know how many elements are going to be added beforehand, you could call std::vector<T>::reserve() before any push_back() calls:

vertices.reserve(4);

This avoids any sort of reallocation and simply does 4 copies.