I did a bit of an experiment to try to understand references in C :
#include <iostream>
#include <vector>
#include <set>
struct Description {
int a = 765;
};
class Resource {
public:
Resource(const Description &description) : mDescription(description) {}
const Description &mDescription;
};
void print_set(const std::set<Resource *> &resources) {
for (auto *resource: resources) {
std::cout << resource->mDescription.a << "\n";
}
}
int main() {
std::vector<Description> descriptions;
std::set<Resource *> resources;
descriptions.push_back({ 10 });
resources.insert(new Resource(descriptions.at(0)));
// Same as description (prints 10)
print_set(resources);
// Same as description (prints 20)
descriptions.at(0).a = 20;
print_set(resources);
// Why? (prints 20)
descriptions.clear();
print_set(resources);
// Object is written to the same address (prints 50)
descriptions.push_back({ 50 });
print_set(resources);
// Create new array
descriptions.reserve(100);
// Invalid address
print_set(resources);
for (auto *res : resources) {
delete res;
}
return 0;
}
https://godbolt.org/z/TYqaY6Tz8
I don't understand what is going on here. I have found this excerpt from C FAQ:
Important note: Even though a reference is often implemented using an address in the underlying assembly language, please do not think of a reference as a funny looking pointer to an object. A reference is the object, just with another name. It is neither a pointer to the object, nor a copy of the object. It is the object. There is no C syntax that lets you operate on the reference itself separate from the object to which it refers.
This creates some questions for me. So, if reference is the object itself and I create a new object in the same memory address, does this mean that the reference "becomes" the new object? In the example above, vectors are linear arrays; so, as long as the array points to the same memory range, the object will be valid. However, this becomes a lot trickier when other data sets are being used (e.g sets, maps, linked lists) because each "node" typically points to different parts of memory.
Should I treat references as undefined if the original object is destroyed? If yes, is there a way to identify that the reference is destroyed other than a custom mechanism that tracks the references?
Note: Tested this with GCC, LLVM, and MSVC
CodePudding user response:
The note is misleading, treating references as syntax sugar for pointers is fine as a mental model. In all the ways a pointer might dangle, a reference will also dangle. Accessing dangling pointers/references is undefined behaviour (UB).
int* p = new int{42};
int& i = *p;
delete p;
void f(int);
f(*p); // UB
f(i); // UB, with the exact same reason
This also extends to the standard containers and their rules about pointer/reference invalidation. The reason any surprising behaviour happens in your example is simply UB.