Why do we need to make a local copy of smart_ptr first before passing it to other functions?

Time:11-16

CppCon 2015: Herb Sutter "Writing Good C++14... By Default" Slide 50


I came across the following guideline in the talk above. However, I have a hard time understanding the problems that the two solutions are meant to solve in the first place. All comments on the right side of the code are copied from the original talk, and I don't understand those comments either.

void f(int*);
void g(shared_ptr<int>&, int*);
shared_ptr<int> gsp = make_shared<int>();

int main()
{
  // Issue 1>
  f(gsp.get()); // ERROR, arg points to gsp', and gsp is modifiable by f
  // Solution 1>
  auto sp = gsp;
  f(sp.get());  // ok. arg points to sp', and sp is not modifiable by f

  // Issue 2>
  g(sp, sp.get());  // ERROR, arg2 points to sp', and sp is modifiable by f
  // Solution 2>
  g(gsp, sp.get()); // ok, arg2 points to sp', and sp is not modifiable by f
}

Can someone please explain what goes wrong if we write the code shown in Issue 1 and Issue 2, and why the solutions fix those problems?

CodePudding user response:

f could be written like this:

void f(int* p)
{
  gsp = nullptr;
  *p = 5;
}

Since gsp is the only shared_ptr that owns the int, clearing it destroys the int. p is then a dangling pointer, and writing through it is undefined behavior.

Basically, it is reasonable for functions which are given non-owning pointers (or references) to expect that the objects being pointed to are owned by someone outside of themselves, and that there is nothing they can do to cause them to be destroyed.

g is something similar, though with a reference as a parameter rather than a modifiable global:

void g(shared_ptr<int> &sp, int* p)
{
  sp = nullptr;
  *p = 5;
}

Both of these are warnings: if some code passes an owned object to a function without explicitly transferring ownership, the calling code must ensure that there is nothing the called function can do to destroy that owned object by accident. In real-world scenarios, the situation will be far more complex, and it will be much harder to work out what actually happened.

In cases like g, the parameter might be a reference to some complex object which happens to hold the last shared_ptr to the int you also passed as a parameter. In cases like f, the global object might be some "singleton" that manages a bunch of things, including ownership of objects, and the function just so happened to be passed a pointer to such an object. Inadvertently calling the wrong function from f or g can cause the pointer to become invalid.
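As a minimal sketch of the "complex object" case, assume a hypothetical Cache type (not from the talk or the question) whose refresh() member replaces the int it owns. The names Cache, update, and pinned_update are made up for illustration. Nothing in update() visibly resets a shared_ptr, yet the raw pointer only stays valid because the caller holds its own copy:

```cpp
#include <cassert>
#include <memory>

// Hypothetical "complex object" that happens to own the int.
struct Cache {
    std::shared_ptr<int> current = std::make_shared<int>(0);
    // Replaces the owned int; the cache's share of the old int is released.
    void refresh() { current = std::make_shared<int>(0); }
};

// Looks harmless, but refresh() drops the cache's ownership of the old int.
void update(Cache& cache, int* p) {
    cache.refresh();
    *p = 5;  // dangling unless the caller pinned the old int
}

// Caller-side fix: take a local copy before handing out the raw pointer.
int pinned_update() {
    Cache cache;
    auto pin = cache.current;      // owner that update() cannot reach
    update(cache, pin.get());
    assert(pin != cache.current);  // the cache now holds a different int
    return *pin;                   // the old int is still alive through pin
}
```

Calling pinned_update() from main is safe. Passing cache.current.get() directly, without the pin, would leave update() writing to freed memory.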

CodePudding user response:

Raw pointers have caused a lot of problems in C++. The key problem is that their lifetime is totally unclear. If I have a T*, then I have no idea how long that T is supposed to live, or if I'm supposed to free it, or if the caller will free it. Getting that wrong can result in memory leaks, double frees, and other undefined behavior.

So Herb Sutter, in his presentation, places some basic rules on the use of raw pointers. In particular, he assumes implicitly that raw pointers are always borrowed. That is, a T* never owns the underlying T and hence should never free it. (If you, for some reason, need an owning T* and can't use a smart pointer, the C++ Core Guidelines recommend using the gsl::owner<T*> alias to make your intentions clear, even though the alias itself doesn't change anything semantically.)
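To illustrate the borrowed-versus-owned convention, here is a rough sketch that uses a local stand-in for the alias the Core Guidelines' support library spells gsl::owner. The stand-in is defined here so the snippet compiles without the GSL. The helper names make_value, peek, and owner_demo are hypothetical.

```cpp
#include <cassert>
#include <type_traits>

// Local stand-in for gsl::owner: a plain alias, so owner<T*> is the same type as T*.
template <class T, class = std::enable_if_t<std::is_pointer<T>::value>>
using owner = T;

static_assert(std::is_same<owner<int*>, int*>::value,
              "the alias documents intent but adds no semantics");

// Returns an owning pointer: the caller is responsible for deleting it.
owner<int*> make_value(int v) { return new int(v); }

// Takes a borrowed pointer: it may read the int but must never delete it.
int peek(const int* borrowed) { return *borrowed; }

int owner_demo() {
    owner<int*> p = make_value(7);
    int result = peek(p);  // lend the pointer for the duration of the call
    delete p;              // only the owner releases the object
    return result;
}
```

The alias changes nothing for the compiler. Its value is that readers and static-analysis tools can tell which raw pointers are expected to be deleted.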

So a T* is borrowing data. If we borrow data, we need to know that it will live at least as long as the code that borrows it. That's where the top of Slide 50 comes in.

In callee, assume Pointer params are valid for the call, and independent.

So, under Sutter's convention, a function that takes a T* should assume that the pointer remains valid at least until the function returns.

Now, if I pass the innards of a shared_ptr, it's possible that I'll invalidate it accidentally, without even knowing about it.

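#include <memory>
using namespace std;

shared_ptr<int> gsp = make_shared<int>();
shared_ptr<int> some_other_data = make_shared<int>(); // any unrelated shared_ptr

void f(int* my_pointer) {
  gsp = some_other_data; // drops the last owner of the original int
  // Oh no! my_pointer is suddenly garbage!
}

int main() {
  f(gsp.get());
}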

At the "Oh no!" point in the code above, my_pointer is dangling, and dereferencing it is undefined behavior. But f couldn't possibly have known that, since it never actually did anything with that pointer, just with a (theoretically unrelated) global.

The data a shared_ptr points to is only freed when all of the shared pointers referencing it are gone. So by creating a local variable in main that also points to the same data, we ensure that the data won't be freed until main ends, even if the function accidentally invalidates the one reference that it does have access to.
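One way to observe this without triggering undefined behavior is to watch the object through a std::weak_ptr, which does not take part in ownership. The sketch below is an illustration under that assumption. The function names f_survives, unpinned_call, and pinned_call are made up, and f_survives is a stand-in for f that reports whether the int survived instead of dereferencing its pointer:

```cpp
#include <cassert>
#include <memory>

std::shared_ptr<int> gsp;  // global owner, as in the question

// Stand-in for f: drops the global owner, then reports whether the int
// is still alive. The raw pointer is deliberately never dereferenced.
bool f_survives(int* /*p*/, const std::weak_ptr<int>& watch) {
    gsp = nullptr;
    return !watch.expired();
}

// Issue 1: gsp is the only owner, so resetting it destroys the int.
bool unpinned_call() {
    gsp = std::make_shared<int>(0);
    std::weak_ptr<int> watch = gsp;
    return f_survives(gsp.get(), watch);
}

// Solution 1: a local copy keeps the int alive for the whole call.
bool pinned_call() {
    gsp = std::make_shared<int>(0);
    std::weak_ptr<int> watch = gsp;
    auto sp = gsp;  // second owner that the callee cannot reach
    return f_survives(sp.get(), watch);
}
```

Here unpinned_call() returns false, because the int is gone by the time the callee would use its pointer. pinned_call() returns true, because sp still owns the int.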

The g example shows that the same thing can happen even without global variables. If a function takes a std::shared_ptr<T>& and a T* that refer to the same data, then resetting the former can invalidate the latter, unless another shared_ptr<T> that the callee cannot reach is keeping the object alive.
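The call shape from "Solution 2" in the question can be sketched as follows, assuming g behaves like the version shown in the first answer. The wrapper name solution_2 is hypothetical. The asserts check that g really cleared the owner it was given while the local copy kept the int alive:

```cpp
#include <cassert>
#include <memory>

std::shared_ptr<int> gsp = std::make_shared<int>(0);

// g as written in the first answer: it resets the shared_ptr it receives.
void g(std::shared_ptr<int>& sp, int* p) {
    sp = nullptr;
    *p = 5;  // safe only while some other shared_ptr owns the int
}

// Mirrors "Solution 2": g gets gsp by reference, but the raw pointer comes
// from a local copy that g has no way to modify.
int solution_2() {
    auto sp = gsp;
    g(gsp, sp.get());
    assert(gsp == nullptr);       // g cleared the owner it was given
    assert(sp.use_count() == 1);  // the local copy is now the sole owner
    return *sp;
}
```

Writing g(sp, sp.get()) instead would hand g the only owner and a pointer into the same object, which is exactly the "Issue 2" situation.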
