Type-pun uint64_t as two uint32_t in C++20

This code to read a uint64_t as two uint32_t is UB due to the strict aliasing rule:

uint64_t v;
uint32_t lower = reinterpret_cast<uint32_t*>(&v)[0];
uint32_t upper = reinterpret_cast<uint32_t*>(&v)[1];

Likewise, this code to write the upper and lower parts of a uint64_t is UB for the same reason:

uint64_t v;
uint32_t* lower = reinterpret_cast<uint32_t*>(&v);
uint32_t* upper = reinterpret_cast<uint32_t*>(&v) + 1;

*lower = 1;
*upper = 1;

How can one write this code in a safe and clean way in modern C++20, potentially using std::bit_cast?

CodePudding user response：

Using std::bit_cast:

#include <bit>
#include <array>
#include <cstdint>
#include <iostream>

int main() {
    uint64_t x = 0x12345678'87654321ULL;
    // Convert one u64 -> two u32
    auto v = std::bit_cast<std::array<uint32_t, 2>>(x);
    std::cout << std::hex << v[0] << " " << v[1] << std::endl;
    // Convert two u32 -> one u64
    auto y = std::bit_cast<uint64_t>(v);
    std::cout << std::hex << y << std::endl;
}

Output (on a little-endian machine):

87654321 12345678
1234567887654321

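std::bit_cast is available only since C++20. Before C++20 you can implement it yourself with std::memcpy; the one drawback is that such an implementation is not constexpr, unlike the C++20 version:

#include <cstring>      // std::memcpy
#include <type_traits>  // std::is_trivially_copyable_v, std::is_trivially_constructible_v

template <class To, class From>
inline To bit_cast(From const & src) noexcept {
    //return std::bit_cast<To>(src);
    static_assert(sizeof(To) == sizeof(From),
        "Source and destination types should have the same size");
    static_assert(std::is_trivially_copyable_v<To> && std::is_trivially_copyable_v<From>,
        "Both types should be trivially copyable");
    static_assert(std::is_trivially_constructible_v<To>,
        "Destination type should be trivially constructible");
    To dst;
    std::memcpy(&dst, &src, sizeof(To));
    return dst;
}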
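
As a usage sketch, a memcpy-based helper like the one above can split and rejoin a uint64_t the same way as the std::bit_cast example. The names bit_cast_compat, split_u64, join_u64 and round_trip_check are hypothetical, chosen only for illustration, and the order of the two array elements depends on the platform's endianness:

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical pre-C++20 stand-in for std::bit_cast (not constexpr).
template <class To, class From>
To bit_cast_compat(From const& src) noexcept {
    static_assert(sizeof(To) == sizeof(From), "sizes must match");
    static_assert(std::is_trivially_copyable<To>::value &&
                  std::is_trivially_copyable<From>::value,
                  "both types must be trivially copyable");
    To dst;
    std::memcpy(&dst, &src, sizeof(To));
    return dst;
}

// Split a u64 into two u32 in memory order (endianness-dependent).
inline std::array<std::uint32_t, 2> split_u64(std::uint64_t x) {
    return bit_cast_compat<std::array<std::uint32_t, 2>>(x);
}

// Rejoin the halves; the inverse of split_u64 on the same platform.
inline std::uint64_t join_u64(std::array<std::uint32_t, 2> const& halves) {
    return bit_cast_compat<std::uint64_t>(halves);
}

// The round trip holds regardless of endianness.
inline void round_trip_check() {
    const std::uint64_t x = 0x1234567887654321ULL;
    assert(join_u64(split_u64(x)) == x);
}
```

Only the round trip is portable here; which element holds the low half is not.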

For this specific case of integers, plain bit shift/or arithmetic is an optimal way to convert one u64 to two u32 and back again. std::bit_cast is more generic, supporting any trivially copyable types of equal size, and on modern compilers with optimization enabled it should be just as efficient as the bit arithmetic.

An additional benefit of bit arithmetic is that it handles endianness correctly: the result is the same on every platform, unlike with std::bit_cast, whose array element order follows the machine's byte order.

#include <cstdint>
#include <iostream>

int main() {
    uint64_t x = 0x12345678'87654321ULL;
    // Convert one u64 -> two u32
    uint32_t lo = uint32_t(x), hi = uint32_t(x >> 32);
    std::cout << std::hex << lo << " " << hi << std::endl;
    // Convert two u32 -> one u64
    uint64_t y = (uint64_t(hi) << 32) | lo;
    std::cout << std::hex << y << std::endl;
}

Output:

87654321 12345678
1234567887654321
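
To make the endianness point concrete, here is a minimal sketch that takes the memory-order split (what std::bit_cast or std::memcpy produces) and normalises it to {lo, hi}, so that it agrees with the shift-based version on both little- and big-endian machines. The helper names is_little_endian, split_lo_hi and check_against_shifts are hypothetical, and mixed-endian layouts are assumed not to occur. In C++20, std::endian::native from <bit> could replace the runtime probe:

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <cstring>
#include <utility>

// Hypothetical helper: true if the least significant byte is stored first.
inline bool is_little_endian() {
    const std::uint16_t probe = 1;
    unsigned char first_byte;
    std::memcpy(&first_byte, &probe, 1);
    return first_byte == 1;
}

// Copy the u64 into two u32 in memory order, then normalise to {lo, hi}.
inline std::array<std::uint32_t, 2> split_lo_hi(std::uint64_t x) {
    std::array<std::uint32_t, 2> halves;
    static_assert(sizeof(halves) == sizeof(x), "unexpected padding in std::array");
    std::memcpy(&halves, &x, sizeof(x));
    if (!is_little_endian()) {
        std::swap(halves[0], halves[1]); // big-endian stores the high half first
    }
    return halves;
}

// The normalised split should match plain shifts on any of the assumed platforms.
inline void check_against_shifts(std::uint64_t x) {
    const std::array<std::uint32_t, 2> h = split_lo_hi(x);
    assert(h[0] == static_cast<std::uint32_t>(x));
    assert(h[1] == static_cast<std::uint32_t>(x >> 32));
}
```

The shift-based code needs none of this, which is the advantage described above.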

CodePudding user response：

in a safe and clean way

Do not use reinterpret_cast. Do not rely on unclear code whose behavior depends on specific compiler settings and is not guaranteed. Use exact arithmetic operations with well-defined results. Classes and operator overloads are there for exactly this. For example, with some global functions:

#include <cstdint>
#include <iostream>

struct UpperUint64Ref {
   uint64_t &v;
   UpperUint64Ref(uint64_t &v) : v(v) {}
   UpperUint64Ref operator=(uint32_t a) {
      v &= 0x00000000ffffffffull;
      v |= (uint64_t)a << 32;
      return *this;
   }
   operator uint32_t() const {
      return uint32_t(v >> 32); // read back only the upper half
   }
};
struct LowerUint64Ref { 
    uint64_t &v;
    LowerUint64Ref(uint64_t &v) : v(v) {}
    /* as above */
};
UpperUint64Ref upper(uint64_t& v) { return v; }
LowerUint64Ref lower(uint64_t& v) { return v; }

int main() {
   uint64_t v = 0; // initialize: operator= reads v before writing it
   upper(v) = 1;
}
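
The lower-half proxy is left as "as above" in the answer. One possible way to fill it in, as a sketch only: the mask, the conversion operator and the lower_demo function below are my assumptions, written to mirror the upper-half version rather than taken from the original:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical completion of the lower-half proxy; mirrors UpperUint64Ref.
struct LowerUint64Ref {
    std::uint64_t& v;
    explicit LowerUint64Ref(std::uint64_t& v) : v(v) {}
    LowerUint64Ref& operator=(std::uint32_t a) {
        v &= 0xffffffff00000000ull; // clear the lower 32 bits
        v |= a;                     // a is zero-extended to 64 bits
        return *this;
    }
    // Read back only the lower half.
    operator std::uint32_t() const { return static_cast<std::uint32_t>(v); }
};

inline LowerUint64Ref lower(std::uint64_t& v) { return LowerUint64Ref{v}; }

// Usage: writing the lower half leaves the upper half untouched.
inline void lower_demo() {
    std::uint64_t v = 0x1111111122222222ull;
    lower(v) = 0xabcdef01u;
    assert(v == 0x11111111abcdef01ull);
    const std::uint32_t lo = lower(v);
    assert(lo == 0xabcdef01u);
}
```

No shift is needed for the lower half, so the only difference from the upper version is the mask.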

Or interface object:

#include <cstdint>
#include <iostream>

struct Uint64Ref {
   uint64_t &v;
   Uint64Ref(uint64_t &v) : v(v) {}
   struct UpperReference {
       uint64_t &v;
       UpperReference(uint64_t &v) : v(v) {}
       UpperReference operator=(uint32_t a) {
           v &= 0x00000000ffffffffull;
           v |= (uint64_t)a << 32u;
           return *this; // a non-void function must return a value
       }
   };
   UpperReference upper() {
      return v;
   }
   struct LowerReference {
       uint64_t &v;
       LowerReference(uint64_t &v) : v(v) {}
   };
   LowerReference lower() { return v; }
};
int main() {
   uint64_t v = 0; // initialize: operator= reads v before writing it
   Uint64Ref r{v};
   r.upper() = 1;
}

CodePudding user response：

Using std::memcpy

#include <cstdint>
#include <cstring>

void foo(uint64_t& v, uint32_t low_val, uint32_t high_val) {
    std::memcpy(reinterpret_cast<unsigned char*>(&v), &low_val,
                sizeof(low_val));
    std::memcpy(reinterpret_cast<unsigned char*>(&v) + sizeof(low_val),
                &high_val, sizeof(high_val));
}

int main() {
    uint64_t v = 0;
    foo(v, 1, 2);
}

With -O1, the compiler reduces foo to:

        mov     DWORD PTR [rdi], esi
        mov     DWORD PTR [rdi+4], edx
        ret

That is, no extra copies are made; std::memcpy just serves as a hint to the compiler.
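
The question also asks about reading the two halves. The same memcpy technique should work in the other direction; the sketch below uses hypothetical names (write_halves, read_halves, memcpy_round_trip). Note that "first" and "second" are in memory order, so which one is the low half depends on endianness:

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Same idea as foo: write two 32-bit values into the u64 in memory order.
inline void write_halves(std::uint64_t& v, std::uint32_t first, std::uint32_t second) {
    unsigned char* bytes = reinterpret_cast<unsigned char*>(&v);
    std::memcpy(bytes, &first, sizeof(first));
    std::memcpy(bytes + sizeof(first), &second, sizeof(second));
}

// Hypothetical counterpart: read the two 32-bit halves back in memory order.
inline void read_halves(const std::uint64_t& v, std::uint32_t& first, std::uint32_t& second) {
    const unsigned char* bytes = reinterpret_cast<const unsigned char*>(&v);
    std::memcpy(&first, bytes, sizeof(first));
    std::memcpy(&second, bytes + sizeof(first), sizeof(second));
}

// Writing then reading returns the same values on any platform.
inline void memcpy_round_trip() {
    std::uint64_t v = 0;
    write_halves(v, 1, 2);
    std::uint32_t a = 0, b = 0;
    read_halves(v, a, b);
    assert(a == 1 && b == 2);
}
```

Accessing the object through unsigned char* is permitted by the aliasing rules, which is why this avoids the UB of the original uint32_t* casts.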