When to use reinterpret_cast without disobeying the strict aliasing rule?

For a long time I've used reinterpret_cast like this:

static_assert(sizeof(int) == sizeof(float));
int a = 1;
float b = *reinterpret_cast<float*>(&a); // viewed as float in binary.

However, when I reviewed type conversion in C++ recently, I found that this is UB under the strict aliasing rule! Dereferencing a pointer whose type is different from (or, to be exact, not similar to) the type of the object it actually points to is undefined, so the compiler may optimize on the assumption that it never happens. (I know that std::bit_cast in C++20 is an alternative.)

From my perspective, reinterpret_cast is mainly used for pointer type conversion. Nevertheless, how can I use another type of pointer considering dereferencing is prohibited?

I've browsed "When to use reinterpret_cast", and an answer is quoted below:

One case when reinterpret_cast is necessary is when interfacing with opaque data types. This occurs frequently in vendor APIs over which the programmer has no control. Here's a contrived example where a vendor provides an API for storing and retrieving arbitrary global data.

But when it comes to the implementation of these APIs, if the data should be used there, it's hard to avoid dereferencing the pointer.

It seems that only converting to another pointer type and back to the initial pointer type is reasonable (but what's the point? Why not use the initial type or define a template directly?). So my question is: when can reinterpret_cast be used without violating that rule? Is there any general usage?

I'm really a novice in C++, so any advice would be appreciated.

CodePudding user response：

When you use a reinterpret_cast in your code, you are telling the compiler: "I know what I'm doing; just implement the cast and trust me that the result will be OK to use." The compiler will then use the result of that cast as it would any other object of the specified destination type.

So, if you know that a particular value is the address of an actual float data type, then you can safely cast (say) an intptr_t value to a float* and dereference the resulting pointer.

A common case for such use of reinterpret_cast occurs in Windows (WinAPI) programming: the LPARAM of a Windows message is often used to point to a particular type of data structure. Here is a 'pseudo-example' of a handler for the WM_NOTIFY message:

BOOL OnNotify(WPARAM wParam, LPARAM lParam, LRESULT *pResult)
{
    NMHDR *pHdr = reinterpret_cast<NMHDR *>(lParam);
    switch (pHdr->code) {
        // ... do some stuff with the data in the referenced NMHDR structure
    }
    //...
    *pResult = 0;
    return TRUE;
}

In this case, the Windows framework has 'promised' that the lParam argument received will be the address of an NMHDR structure, so the cast is safe. (Note that other C++ casts won't work for such a conversion: only reinterpret_cast or a "C-style" cast can convert the integer type to a pointer.)
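The same pattern can be sketched without the Windows headers. The following is only an illustration: Notification, handle_message and send_notification are hypothetical names, and std::uintptr_t stands in for LPARAM. The cast back is fine because the integer really does hold the address of a live object of the destination type.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical payload type; it plays the role of NMHDR.
struct Notification {
    int code;
};

// Hypothetical handler that receives the address as an integer, the way
// a message loop would deliver it.
inline int handle_message(std::uintptr_t param) {
    // Valid only because the sender guarantees that param holds the
    // address of a live Notification object.
    const Notification* n = reinterpret_cast<const Notification*>(param);
    return n->code;
}

// Hypothetical sender: packs the object's address into the integer parameter.
inline int send_notification(const Notification& n) {
    return handle_message(reinterpret_cast<std::uintptr_t>(&n));
}
```

No object is ever accessed through a pointer of the wrong type here, so the strict aliasing rule is not involved at all.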

However, using any sort of cast (reinterpret_cast or C-style, especially) for so-called type punning is never a good idea, because of the strict aliasing rules you have mentioned in your question. To copy the bits ("as-is") from an int to a float (assuming those types are the same size, and that you don't have access to the C++20 std::bit_cast), you should use std::memcpy, instead of dereferencing aliased pointers.
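A minimal sketch of the std::memcpy approach, assuming int and float have the same size and that float uses the usual IEEE 754 binary32 layout. The helper names int_bits_as_float and float_bits are made up for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: reinterprets the bits of an int as a float by
// copying bytes, so no aliased pointer is ever dereferenced.
inline float int_bits_as_float(int value) {
    static_assert(sizeof(int) == sizeof(float), "sizes must match");
    float result;
    std::memcpy(&result, &value, sizeof result);
    return result;
}

// Hypothetical helper for the opposite direction: returns the object
// representation of a float as a 32-bit unsigned integer.
inline std::uint32_t float_bits(float value) {
    static_assert(sizeof(std::uint32_t) == sizeof(float), "sizes must match");
    std::uint32_t bits;
    std::memcpy(&bits, &value, sizeof bits);
    return bits;
}
```

Optimizing compilers typically turn such a fixed-size memcpy into a plain register move, so this is usually no slower than the pointer cast.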

CodePudding user response：

It simply comes down to this: Don't lie to your compiler.

The point of reinterpret_cast is spelled out right there: when interfacing with opaque data types and vendor APIs over which the programmer has no control. I.e. to fix other people's broken code.

You are correct that you should be using the correct type from the start. If there is more than one possible type to support, then a template is often the solution. Templates bring with them their own problems, like having to define everything in headers, code duplication in the binary, and not being able to mix values of different types. Frequently the code does not care what type is passed through some API, as it never touches the value itself and only passes it back to some callback. For that you want a signature that isn't dependent on the user-provided type.

C++17 added std::variant and, for the most unconstrained case, std::any. std::variant is a union on steroids and lets you pass one of many values. std::any is the type-safe version of your reinterpret_cast, allowing you to pass any type and later get the value back, provided you know what type it is.

Use those options for APIs when you need to handle user-provided types.
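As a rough sketch of both options (C++17 assumed): Slot, Setting and describe are hypothetical names invented for this example, not part of any real API.

```cpp
#include <any>
#include <cassert>
#include <string>
#include <utility>
#include <variant>

// Hypothetical API that stores one user-provided value of any copyable type.
class Slot {
public:
    void store(std::any value) { value_ = std::move(value); }

    // Returns a pointer to the stored value if it has exactly type T,
    // otherwise nullptr. No unchecked cast is involved.
    template <typename T>
    const T* get() const { return std::any_cast<T>(&value_); }

private:
    std::any value_;
};

// A closed set of alternatives: the value is exactly one of these types.
using Setting = std::variant<int, double, std::string>;

// Reports which alternative is currently held.
inline std::string describe(const Setting& s) {
    if (std::holds_alternative<int>(s)) return "int";
    if (std::holds_alternative<double>(s)) return "double";
    return "string";
}
```

Asking a Slot for the wrong type yields a null pointer rather than undefined behavior, which is the practical difference from a reinterpret_cast on a void*.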

Use std::bit_cast when you need to hack around in the bit patterns of types, like accessing a float as an int to do some magic. Applied to single values, it has no aliasing problems. In practice, compilers also tend to handle the old-style reinterpret_cast approach or a union correctly on single local values, but formally both are undefined behavior in C++ rather than implementation-defined, so you are relying on what your particular compiler does (GCC, for example, documents type punning through a union as supported). As long as you don't pass the reinterpreted pointers to other functions, the compiler can usually see what you are doing, but that is not a guarantee.
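Here is a small sketch of the std::bit_cast route, assuming a 32-bit IEEE 754 float whose sign is the top bit. The names bits_of and sign_bit_set are hypothetical, and the std::memcpy branch is only a fallback so that the snippet still builds on a pre-C++20 toolchain.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#if defined(__has_include)
#  if __has_include(<bit>)
#    include <bit>
#  endif
#endif

// Hypothetical helper: returns the bit pattern of a float as an integer.
inline std::uint32_t bits_of(float value) {
    static_assert(sizeof(std::uint32_t) == sizeof(float), "sizes must match");
#if defined(__cpp_lib_bit_cast)
    // C++20: value-to-value conversion, no pointers involved.
    return std::bit_cast<std::uint32_t>(value);
#else
    // Fallback for older standard libraries.
    std::uint32_t bits;
    std::memcpy(&bits, &value, sizeof bits);
    return bits;
#endif
}

// Example of "magic" on the bit pattern: test the sign bit directly
// (assumes the IEEE 754 binary32 layout).
inline bool sign_bit_set(float value) {
    return (bits_of(value) >> 31) != 0;
}
```

Because std::bit_cast takes and returns values, there is no pointer to alias in the first place.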