Suppose I have something along these lines:
class Base {
public:
    Base(int value) : value_(value) {}
    int getValue() const { return value_; }
private:
    int value_;
};
class Derived : public Base {
public:
    // Derived only has non-virtual functions. No added data members.
    // value_ is private in Base, so go through the public accessor.
    int getValueSquared() const { return getValue() * getValue(); }
};
And I do the following:
Base* base = new Base(42);
Derived* derived = static_cast<Derived*>(base);
std::cout << derived->getValueSquared() << std::endl;
Strictly speaking, this is UB. Practically speaking, it works just fine.
Actual data members from Base (e.g., int value_) have to be located at the same offsets whether the object is an actual Base or an actual Derived (otherwise, good luck upcasting). And getValueSquared() isn't part of the actual memory footprint of a Derived instance so it's not like it will be "missing" or unconstructed from the in-memory Base object.
I know that UB is all the reason I need not to do this but, logically, it seems it would always work. So, why not?
I am asking because it seems like an interesting quirk to discuss...not because I plan on using it in production.
CodePudding user response:
In practice, most compilers will convert a non-virtual member function into a static function with a hidden this parameter. As long as the function doesn't use any data members that aren't part of the base class, it will probably work.
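To make the hidden this parameter concrete, here is a minimal hand-written sketch of roughly what that conversion amounts to. The names getValueSquared_lowered and demo are made up for illustration; this is not actual compiler output. Note that the hand-written version is well defined, because it only ever treats the object as a Base.

```cpp
#include <cassert>

class Base {
public:
    Base(int value) : value_(value) {}
    int getValue() const { return value_; }
private:
    int value_;
};

// Hypothetical stand-in for what a non-virtual member function becomes:
// an ordinary function whose first parameter plays the role of `this`.
int getValueSquared_lowered(const Base* self) {
    return self->getValue() * self->getValue();
}

// Usage: no Derived type and no downcast are involved.
void demo() {
    Base base(42);
    assert(getValueSquared_lowered(&base) == 42 * 42);
}
```

If the extra operation is all you want, a free function like this gets you there without the cast and without the UB.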
The problem with UB is that you can't predict it. Something that worked yesterday can fail today, with no rhyme or reason behind it. The compiler is given a lot of latitude on how to interpret anything that's technically undefined, and the race to find better optimizations means that unexpected changes can happen suddenly. Murphy's law says that these changes will be most evident when you're demoing the software to your most important boss or biggest customer.
CodePudding user response:
It is UB indeed. Aliasing rules break:
#include <iostream>
struct Base {
    int value = 0;
};
struct Derived1 : Base {
    void inc10() { value += 10; }
};
struct Derived2 : Base {
    void inc20() { value += 20; }
};
void doit(Derived1 *d1, Derived2 *d2) {
    std::cout << (&d1->value == &d2->value) << "\n";
    d1->inc10();
    d2->inc20();
    std::cout << d1->value << " " << d2->value << "\n";
}
int main() {
    Base b;
    doit(static_cast<Derived1*>(&b), static_cast<Derived2*>(&b));
    std::cout << b.value << "\n";
}
My GCC 11.2.0, when compiled with g++ a.cpp -O2 -o a, prints
1
10 30
30
The compiler is free to assume that a Derived1* and a Derived2* point to different objects, so it optimizes away the second read of d1->value after calling d2->inc20(), because the latter cannot affect the former.
Clang 13.0.0 does not exhibit such behavior, and that's fine too: with UB, either outcome is allowed.
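For contrast, here is a sketch of a well-defined variant of the same experiment, assuming the increments are rewritten as free functions on Base (inc10, inc20, doit and demo below are illustrative rewrites, not part of the code above). Because both parameters now have the same type, the compiler has to allow for them pointing at the same object, so it cannot reuse a stale value.

```cpp
#include <cassert>

struct Base {
    int value = 0;
};

// Free functions on Base replace the Derived1/Derived2 member functions.
void inc10(Base* b) { b->value += 10; }
void inc20(Base* b) { b->value += 20; }

// Both pointers are Base*, so they may legitimately alias; the read of
// b1->value after inc20(b2) must observe the update made through b2.
int doit(Base* b1, Base* b2) {
    inc10(b1);
    inc20(b2);
    return b1->value;
}

// Usage: passing the same object twice is fine here.
void demo() {
    Base b;
    assert(doit(&b, &b) == 30);
    assert(b.value == 30);
}
```

Here the result is 30 at any optimization level, since nothing in the program relies on a cast to a type the object does not have.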