Expression type vs. expression value category vs. object type and when is which used?

Time:10-14

There are three somewhat different concepts in C++:

Value category of an expression

  • lvalue, rvalue, etc.
  • I'm not able to formalize this, but roughly speaking, it's something that can often be deduced fairly easily in a purely grammatical way. For example, with "1.0" or "n" you often don't need any knowledge of the context: you simply look at the symbols of the expression and can deduce its value category.
  • There are exceptions to this however - with f(x) for example we need to know whether f is a function or a type, and we also need to know whether it returns a value/reference type.
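The cases in the list above can be checked at compile time. The following is only a sketch: decltype((e)) yields T& for an lvalue, T&& for an xvalue and T for a prvalue, and the names Cat, cat_of, VALUE_CATEGORY and the three by_* functions are hypothetical helpers introduced here for illustration (assuming C++17).

```cpp
#include <utility>

// Hypothetical helper names, not from the question.
enum class Cat { prvalue, lvalue, xvalue };

// decltype((e)) is T& for an lvalue, T&& for an xvalue, plain T for a prvalue.
template <class T> constexpr Cat cat_of      = Cat::prvalue;
template <class T> constexpr Cat cat_of<T&>  = Cat::lvalue;
template <class T> constexpr Cat cat_of<T&&> = Cat::xvalue;

#define VALUE_CATEGORY(...) cat_of<decltype((__VA_ARGS__))>

int   n = 0;
int   by_value();       // only declared: used in unevaluated operands
int&  by_lvalue_ref();
int&& by_rvalue_ref();

// Cases that can be read off the expression itself.
static_assert(VALUE_CATEGORY(1.0) == Cat::prvalue);
static_assert(VALUE_CATEGORY(n)   == Cat::lvalue);

// f(x): the category depends on what f is and what it returns.
static_assert(VALUE_CATEGORY(by_value())      == Cat::prvalue);
static_assert(VALUE_CATEGORY(by_lvalue_ref()) == Cat::lvalue);
static_assert(VALUE_CATEGORY(by_rvalue_ref()) == Cat::xvalue);
static_assert(VALUE_CATEGORY(int(n))          == Cat::prvalue);  // "f" is a type here
static_assert(VALUE_CATEGORY(std::move(n))    == Cat::xvalue);
```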

Type of an expression

  • An intuitive explanation of this would be: "if this expression were to be evaluated, it would have some value. What would be the type of this value?" Because it has to do with values, it cannot be a reference type: the type of an expression is always a non-reference type.
  • Unlike the above, this can easily require some sort of knowledge of the context - to deduce the type of "n" you need a symbol table that stores the type of "n".
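As a small illustration of the difference between the declared type of a name and the type of the expression naming it, the sketch below compares decltype(name) with decltype((name)). The alias expr_type is a hypothetical helper that strips the reference which decltype((e)) uses to encode the value category (assuming C++17).

```cpp
#include <type_traits>

int  n = 0;
int& r = n;

// decltype(name), without extra parentheses, reports the declared type.
static_assert(std::is_same_v<decltype(n), int>);
static_assert(std::is_same_v<decltype(r), int&>);

// decltype((name)) reports the expression: both are lvalues of type int,
// so the two expressions are indistinguishable from each other.
static_assert(std::is_same_v<decltype((n)), int&>);
static_assert(std::is_same_v<decltype((r)), int&>);

// Hypothetical helper: drop the category encoding to get the expression type.
template <class E> using expr_type = std::remove_reference_t<E>;
static_assert(std::is_same_v<expr_type<decltype((n))>, int>);
static_assert(std::is_same_v<expr_type<decltype((r))>, int>);
```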

Type of an object

  • This seems a bit more complicated, as an object is something that really only exists in the abstract, at runtime. While it's not something directly understandable as something in source code, regardless of the philosophical difference, we can say that whatever the object is, the type of the object is induced by a declaration expression
  • e.g. for int& x the object that the identifier x is associated with, is of type int&

//

1. Are the descriptions above correct in the sense that they at least roughly match how these concepts are described in the standard?

2. Assuming they are, how and in what situations are these properties used?

As an example, it seems to me that only the expression properties are needed to resolve overloading. The relevant property of whether the object type is a reference or not is encapsulated in the expression value category to some extent, so we don't strictly need to know whether the returned object has an lvalue reference type or not: the case when it does is already covered by the definition of lvalue.

What are some examples of when the object type itself is used, rather than just the expression type?

I am mostly interested in this from the perspective of the standard, but a description on how it might differ from the compiler point of view will be also appreciated.

Answer:

Value category of an expression

Type of an expression

Both of these require knowing not only the grammar of the expression used, which itself requires knowing whether names are types or templates (C++ cannot be parsed without doing name lookup and understanding declarations), but also the value category and types of the operands of the expression. That's not only true for postfix expressions such as function calls. It also applies, for example, to member access.
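A minimal sketch of that dependence on the operands, assuming C++17: the same member access .m and the same operator + give different types and value categories depending on what they are applied to. The struct S is a hypothetical example type; decltype((e)) is used as the probe (T& for an lvalue, T&& for an xvalue, T for a prvalue).

```cpp
#include <type_traits>
#include <utility>

struct S { int m; };  // hypothetical example type

// Member access: category and cv-qualification follow the object expression.
static_assert(std::is_same_v<decltype((std::declval<S&>().m)),       int&>);        // lvalue.m  -> lvalue
static_assert(std::is_same_v<decltype((std::declval<const S&>().m)), const int&>);  // const propagates
static_assert(std::is_same_v<decltype((std::declval<S&&>().m)),      int&&>);       // xvalue.m  -> xvalue

// Binary +: the result type depends on the operand types.
static_assert(std::is_same_v<decltype(1 + 1),     int>);
static_assert(std::is_same_v<decltype(1 + 1.0),   double>);
static_assert(std::is_same_v<decltype('a' + 'b'), int>);  // integral promotion
```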


Type of an object

This is independent of the types and value categories of expressions as you said. All compile-time properties, such as overload resolution, are determined completely by value category and type of expressions. The result of an expression when evaluated (at runtime) may refer to some object (if it is a glvalue expression). The type of this object is not necessarily the same as the type of the expression and can vary between multiple evaluations of the same expression.
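The following sketch (assuming C++17) shows overload resolution reacting only to the type and value category of the argument expression. The overload set which and the from_* wrappers are hypothetical names chosen for this illustration.

```cpp
#include <utility>

struct A {};
struct B : A {};

// Hypothetical overload set; the return value identifies the chosen overload.
constexpr int which(int&)     { return 1; }
constexpr int which(int&&)    { return 2; }
constexpr int which(const A&) { return 10; }
constexpr int which(const B&) { return 20; }

constexpr int from_lvalue()  { int n = 0; return which(n); }
constexpr int from_xvalue()  { int n = 0; return which(std::move(n)); }
constexpr int from_prvalue() { return which(42); }

// r is declared as int&&, but the expression r is an lvalue of type int.
constexpr int from_named_rvalue_ref() { int n = 0; int&& r = std::move(n); return which(r); }

// a refers to the A base subobject of a B object; only the expression type A counts.
constexpr int from_base_ref() { B b{}; A& a = b; return which(a); }
constexpr int from_derived()  { B b{}; return which(b); }

static_assert(from_lvalue()           == 1);
static_assert(from_xvalue()           == 2);
static_assert(from_prvalue()          == 2);
static_assert(from_named_rvalue_ref() == 1);
static_assert(from_base_ref()         == 10);
static_assert(from_derived()          == 20);
```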

Usually the result of a glvalue expression of a given type is supposed to refer to an object of the same type (up to qualifiers), but that is not generally guaranteed. Similarly, a pointer value of a given pointed-to type should be a pointer to an object of that same type (up to qualifiers), but that is not guaranteed either. However, basically the only way to violate these expectations is to use potentially dangerous casts such as reinterpret_cast.

A declaration int& x = /*...*/; doesn't declare or create any object. A reference is not considered an object. x will refer to the object which is the result of the right-hand side. As above, it may not necessarily have the type int. For example:

alignas(int) char y;
int& x = reinterpret_cast<int&>(y);

Now the name x used in an expression has type int and value category lvalue, but the result of the lvalue expression refers to an object of type char (and access through this lvalue is not allowed because it would be an aliasing violation).


Objects are also not only created by declarations. As you said they are a runtime property that can only be talked about when considering a particular state of program execution. Other than from (non-reference) variables, objects can for example be created explicitly as temporary objects (via temporary materialization conversion from prvalues to xvalues), by new expressions or implicitly by certain operations which are defined to do so (e.g. a call to std::malloc under certain (strict) conditions).
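To make some of these routes concrete, here is a rough sketch (assuming C++17) that counts live objects of a hypothetical type Tracked as they are created by a variable definition, by temporary materialization, by a new expression and by placement new. The names Tracked, alive_while_bound and object_creation_demo are illustrative only, and implicit object creation (e.g. via std::malloc) is not covered.

```cpp
#include <cassert>
#include <new>

// Hypothetical helper type: counts how many Tracked objects are currently alive.
struct Tracked {
    static inline int alive = 0;
    Tracked() { ++alive; }
    Tracked(const Tracked&) = delete;
    ~Tracked() { --alive; }
};

// Binding a prvalue to this reference parameter materializes a temporary object.
int alive_while_bound(const Tracked&) { return Tracked::alive; }

void object_creation_demo() {
    assert(Tracked::alive == 0);

    Tracked v;                                  // 1. variable definition
    assert(Tracked::alive == 1);

    assert(alive_while_bound(Tracked{}) == 2);  // 2. temporary, alive during the call
    assert(Tracked::alive == 1);                //    destroyed at the end of the full-expression

    Tracked* p = new Tracked;                   // 3. new expression
    assert(Tracked::alive == 2);
    delete p;
    assert(Tracked::alive == 1);

    alignas(Tracked) unsigned char buf[sizeof(Tracked)];
    Tracked* q = new (buf) Tracked;             // 4. placement new into existing storage
    assert(Tracked::alive == 2);
    q->~Tracked();                              //    has to be destroyed explicitly
    assert(Tracked::alive == 1);
}                                               // v is destroyed here
```

None of these objects is visible to overload resolution or any other compile-time rule; they exist only during a particular execution of object_creation_demo.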

It is also possible to reuse storage of objects, even from declared variables, or to nest objects in certain other objects. For example

static_assert(sizeof(float) == sizeof(int));

alignas(float) int x;
new(&x) float;

Now, if the assertion succeeds, then after the new expression the int object from the declaration is no longer alive, and trying to read/write through x with

x = 1;

would have undefined behavior as x refers to an object outside its lifetime, but e.g.

*std::launder(reinterpret_cast<float*>(&x)) = 1;

will be fine after the new expression (but not before), because reinterpret_cast<float*>(&x) is a prvalue expression of type float* with a pointer value pointing to the out-of-lifetime int object, while the additional std::launder call adjusts the pointer value to point to the float object at the same storage location which is inside its lifetime.

(However, you generally have to make sure that at the point where the lifetime of the declared object would normally end, e.g. at the end of its scope, there actually is an object of the correct type alive (and transparently replaceable with the original object) at the storage location. Otherwise you will have undefined behavior. Also, there is usually no practical reason for an example like the one above. One can simply declare two variables instead and let the compiler figure out whether the same storage can be reused.)


There is also the related concept of dynamic type of an expression, which however is slightly different from all of the above. Consider for example

struct A {
    virtual int f() { return 1; } 
};
struct B : A {
    int f() override { return 2; } 
};

B b;
A& a = b;
int x = a.f();

Here a in the last expression is an lvalue of type A, and the result of the expression also refers to an object of type A, specifically the A base subobject of b. The dynamic type of the expression for this specific evaluation, however, is B, the most-derived type of the object to which the result of the expression a refers. This concept is used to determine which function override is called in virtual dispatch, but it does not affect the (static) type or value category of the call expression a.f(). Using A& a = reinterpret_cast<A&>(b); instead would make the result of the expression a refer to the B object itself rather than to its A base subobject, and the call would have undefined behavior.
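The split between static and dynamic type can be observed directly. In the sketch below (assuming C++17), decltype reports the static properties, while typeid applied to a glvalue of polymorphic class type and the virtual call report the dynamic type. The function names refers_to_B, call_f and dynamic_type_demo are hypothetical.

```cpp
#include <cassert>
#include <type_traits>
#include <typeinfo>

struct A {
    virtual int f() { return 1; }
};
struct B : A {
    int f() override { return 2; }
};

bool refers_to_B(A& a) {
    // Static: the expression a is an lvalue of type A, whatever it refers to.
    static_assert(std::is_same_v<decltype((a)), A&>);
    // Dynamic: typeid on a polymorphic glvalue is evaluated at run time.
    return typeid(a) == typeid(B);
}

int call_f(A& a) {
    // The call expression is a prvalue of type int for either override.
    static_assert(std::is_same_v<decltype(a.f()), int>);
    return a.f();  // virtual dispatch on the dynamic type
}

void dynamic_type_demo() {
    B b;
    A plain;
    assert(refers_to_B(b));       // dynamic type B
    assert(call_f(b) == 2);
    assert(!refers_to_B(plain));  // dynamic type A
    assert(call_f(plain) == 1);
}
```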
