Performance of smart pointer and raw pointer in containers

Time:12-22

I'm curious about the answer to this question, as I mostly work with containers. Which of the following makes the most sense for a vector or map container holding a minimum of 100 and a maximum of 10k elements?

  • std::vector<std::unique_ptr<(struct or class name)>>
  • std::vector<std::shared_ptr<(struct or class name)>>
  • std::vector<(struct or class name)*>

Machine details: FreeBSD 12.1, clang-devel or gcc11.

CodePudding user response:

Start with correct behavior, not performance.

  1. Does your container own your objects? If no, use raw pointers. If yes, use smart pointers. But which ones? See below.
  2. Do you need to support several containers containing the same object, and is it unclear which container will be deleted first? If the answer to both is "yes", use shared_ptr. Otherwise, use unique_ptr.

Later, if you discover that accessing the smart pointers wastes too much time (unlikely), replace the smart pointers by raw pointers together with highly optimized memory management, which you will have to implement according to your specific needs.


As noted in comments, you could do it without pointers. So, before applying this answer, ask yourself why you need pointers at all (I guess the answer is polymorphism, but not sure).

CodePudding user response:

This is really opinion-based, but I'll describe the rules of thumb I use.

std::vector<(struct or class name)> is my default unless I have specific requirements that are not met by that option. More specifically, it is my go-to option UNLESS all three of the following conditions are true:

  • struct or class name is polymorphic and instances of classes derived from struct or class name need to be stored in the vector.
  • struct or class name does not comply with the rule of three (before C++11), the rule of five (from C++11), OR the rule of zero
  • there are SPECIFIC requirements to dynamically manage lifetime of instances of struct or class name

The above criteria amount to "use std::vector<(struct or class name)> if struct or class name meets requirements to be an element of a standard container".

If struct or class name is polymorphic AND there is a requirement that the vector contain instances of derived classes, my default choice is std::vector<std::unique_ptr<(struct or class name)>>, i.e. one of the options mentioned in the question.

I will only go past that choice if there are special requirements for managing lifetime of the objects in the vector that aren't met by either std::vector<(struct or class name)> or std::vector<std::unique_ptr<(struct or class name)>>.

Practically, the above meets the vast majority of real-world needs.

If there is a need for two unrelated pieces of code to have control over the lifetime of objects stored in a vector then (and only then) I will consider std::vector<std::shared_ptr<(struct or class name)>>. The premise is that there will be some code that doesn't have access to our vector, but has access to its elements via (for example) being passed a std::shared_ptr<(struct or class name)>.

Now, I get to the case which is VERY rare in my experience - where there are requirements to manage lifetime of objects that aren't properly handled by std::vector<(struct or class name)>, std::vector<std::unique_ptr<(struct or class name)>>, or by std::vector<std::shared_ptr<(struct or class name)>>.

In that case, and only that case, I will - and only if I'm desperate - use std::vector<(struct or class name)*>. This is the situation to be avoided as much as possible. To give you an idea of how bad I think this option is, I've been known to change other system-level requirements in a quest to avoid it. The reason I avoid this option like the plague is that it becomes necessary to write and debug EVERY bit of code that explicitly manages the lifetime of each struct or class name. This includes writing new expressions everywhere and ensuring every new expression is eventually matched by a corresponding delete expression. It also means debugging hand-written code to ensure that no object is deleted twice (undefined behaviour) and that every object is deleted exactly once (i.e. no leaks). In other words, this option involves lots of effort and - in non-trivial situations - is really hard to get working correctly.

CodePudding user response:

It's hard to provide a firm solution to your question without seeing the context and the way your struct/class operates.
But I still want to provide some basic info about smart pointers so hopefully, you can make a wise decision.

An example:

#include <iostream>
#include <memory>
#include <vector>

int main( )
{
    struct MyStruct
    {
        int a;
        double b;
    };

    std::cout << "Size of unique_ptr: " << sizeof( std::unique_ptr< MyStruct > ) << '\n';
    std::cout << "Size of shared_ptr: " << sizeof( std::shared_ptr< MyStruct > ) << '\n';
    std::cout << '\n';

    std::vector< std::unique_ptr<MyStruct> > vec1; // a container holding unique pointers
    std::vector< MyStruct* > vec2; // another container holding raw pointers

    // MyStruct is an aggregate, so brace initialization is used here;
    // MyStruct(2, 3.6) with parentheses only compiles from C++20 on.
    vec1.emplace_back( std::make_unique<MyStruct>( MyStruct{ 2, 3.6 } ) ); // deletion handled automatically
    vec2.emplace_back( new MyStruct{ 5, 11.2 } ); // must be deleted manually later

    std::cout << vec1[0]->a << ' ' << vec1[0]->b << '\n';
    std::cout << vec2[0]->a << ' ' << vec2[0]->b << '\n';

    for ( MyStruct* p : vec2 ) // manual cleanup for the raw-pointer container
        delete p;
}

Possible output (on a typical 64-bit platform):

Size of unique_ptr: 8
Size of shared_ptr: 16

2 3.6
5 11.2

Check the assembly output and compare the two containers. As far as I could see, they generate exactly the same code.

The unique_ptr is very fast. With the default deleter it is the same size as a raw pointer (as the output above shows), and I don't think it has any overhead. The shared_ptr, however, has a bit of overhead due to its reference counting mechanism, although it still might be more efficient than a handwritten reference counting system. Don't underestimate the facilities provided by the standard library. Use them in most cases, except where they do not perform the specific task you need.

Speaking of performance, std::vector<(struct or class name)> is better in most cases, since all the objects are stored in one contiguous block of heap memory and no extra dereference is needed to reach them.
With a container of pointers, by contrast, your objects will be scattered around heap memory and your program will be less cache-friendly.
