Why allocation and sort of std::pair is faster than std::vector?-CodePudding

Today I just tried to solve a problem in programming. I noticed that allocation and sorting of the vector<vector> are much much slower than vector<pair<int, pair<int, int>>. I took some benchmarks and came to know that nested vector code is 4x slower than nested pair code for the given input (https://pastebin.com/izWGNEZ7).

Below is the code I used for benchmarking.

    auto t_start = std::chrono::high_resolution_clock::now();
    vector<pair<int, pair<int, int>>> edges;
    for (int i = 0; i < points.size(); i  )
        for (int j = i   1; j < points.size(); j  )
            edges.push_back({abs(points[i][0] - points[j][0])   abs(points[i][1] - points[j][1]), {i, j}});
    sort(edges.begin(), edges.end());
    auto t_end = std::chrono::high_resolution_clock::now();
    double elapsed_time_ms = std::chrono::duration<double, std::milli>(t_end - t_start).count();
    cout << elapsed_time_ms << endl;

    auto t_start1 = std::chrono::high_resolution_clock::now();
    vector<vector<int>> edges1;
    for (int i = 0; i < points.size(); i  )
        for (int j = i   1; j < points.size(); j  )
            edges1.push_back({abs(points[i][0] - points[j][0])   abs(points[i][1] - points[j][1]), i, j});
    sort(edges1.begin(), edges1.end());
    auto t_end1 = std::chrono::high_resolution_clock::now();
    double elapsed_time_ms1 = std::chrono::duration<double, std::milli>(t_end1 - t_start1).count();
    cout << elapsed_time_ms1 << endl;

Output:

241.917
1188.11

Does anyone know why there is a big difference in performance?

CodePudding user response：

A std::pair or std::array has a fixed size known at compile time and will include the objects directly in the class itself. A std::vector on the other hand has to deal with dynamic size and needs to allocate a chunk of memory on the heap to hold the objects.

For small objects the std::pair or std::array will be better because the overhead of allocating and freeing memory will eat into your performance. That's what you are seeing. The extra indirection involved with the pointer will also cost you when e.g. comparing the elements as well as having to check the size at run time.

On the other hand for large objects the std::vector should be better because it supports move semantics. Swapping 2 vectors will just swap the pointer to the data while std::pair or std:array will have to move/copy each element, which would be costly for large objects.

So what you see is not that pair is faster than vector but that pair is faster than vector in that use case.