Using memcpy to switch active member of union in C++

Time:09-03

I know about the question "memcpy/memmove to a union member, does this set the 'active' member?", but I think my question is different. So:

Suppose sizeof( int ) == sizeof( float ) and I have the following code snippet:

union U{
    int i;
    float f;
};

U u;
u.i = 1; //i is the active member of u

::std::memcpy( &u.f, &u.i, sizeof( u ) ); //copy memory content of u.i to u.f

My questions:

  1. Does the code lead to undefined behaviour (UB)? If yes, why?
  2. If the code does not lead to UB, what is the active member of u after the memcpy call, and why?
  3. What would the answers to the previous two questions be if sizeof( int ) != sizeof( float ), and why?

CodePudding user response:

You are not allowed to use memcpy to copy overlapping regions of memory:

If the objects overlap, the behavior is undefined.

Your code has undefined behavior because of this violation of memcpy's precondition, as u.f and u.i occupy the same address in memory.
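
To make the overlap concrete, here is a minimal sketch that compares the byte ranges of the two members. The helper names ranges_overlap and demo_overlap are hypothetical (they are not from the question), and the comparison goes through std::uintptr_t, which is an optional type but is available on mainstream platforms.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

union U {
    int i;
    float f;
};

// Hypothetical helper: true if [a, a+n) and [b, b+n) share at least one byte.
inline bool ranges_overlap(const void* a, const void* b, std::size_t n) {
    const auto pa = reinterpret_cast<std::uintptr_t>(a);
    const auto pb = reinterpret_cast<std::uintptr_t>(b);
    return pa < pb + n && pb < pa + n;
}

// Hypothetical demo: the two arguments the question passes to memcpy.
inline void demo_overlap() {
    U u;
    u.i = 1;
    // Taking the address of the inactive member is fine; only the
    // addresses are compared, nothing is read through them.
    assert(ranges_overlap(&u.i, &u.f, sizeof(u)));
}
```

Both members start at the address of the union itself, so the source and destination ranges coincide completely, which is exactly the case memcpy's precondition rules out.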

CodePudding user response:

Regardless of the union, the behaviour of std::memcpy is undefined if the source and destination overlap. This is the case for every member of the union, and it would not be different if the sizes weren't the same.

If you were to use std::memmove instead, there is no longer an issue due to the overlap, and it also doesn't matter that you copy from a member of a union. Since both types are trivially copyable, the behaviour is defined and u.f becomes the active member of the union, but the union holds the same bytes as before in practice.

The only issue would arise if sizeof(U) was larger than sizeof(int), because you would be copying potentially uninitialized bytes. This is undefined behaviour.
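
A minimal sketch of the memmove variant described above, under stated assumptions: int and float have the same size (enforced with a static_assert), float is IEEE 754 binary32, and int is at least 32 bits wide. The helper names switch_to_float and demo_memmove are hypothetical. Reading u.f afterwards relies on this answer's claim that f has become the active member.

```cpp
#include <cassert>
#include <cstring>

union U {
    int i;
    float f;
};

// The size concern made explicit: with equal sizes, copying sizeof(u.i)
// bytes never reads past the int object.
static_assert(sizeof(int) == sizeof(float), "sketch assumes equal sizes");

// Hypothetical helper: reinterpret the bytes of u.i as u.f.
inline float switch_to_float(U& u) {
    // Unlike memcpy, memmove permits source and destination to overlap.
    std::memmove(&u.f, &u.i, sizeof(u.i));
    return u.f;  // assumes f is now the active member, as claimed above
}

// Hypothetical demo: 0x3F800000 is the binary32 bit pattern of 1.0f.
inline void demo_memmove() {
    U u;
    u.i = 0x3F800000;
    assert(switch_to_float(u) == 1.0f);
}
```

Copying sizeof(u.i) bytes instead of sizeof(u) keeps the copy within the int object, so the uninitialized-bytes concern does not come up in this sketch.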

CodePudding user response:

You can use std::bit_cast (C++20) to switch the active member of a union. It is also constexpr, so you can even use the union in a constant expression.

#include <bit>  // std::bit_cast (C++20)

union U {
    int i;
    float f;
    constexpr void switch_to_int() { this->i = std::bit_cast<int>(f); }
    constexpr void switch_to_float() { this->f = std::bit_cast<float>(i); }
};

constexpr int foo() {
    U u{};
    u.f = 2.0f;
    u.switch_to_int();
    return  u.i;
}

int main()
{
    constexpr int i = foo();
}


std::bit_cast behaves like a memcpy into a fresh object (implementations typically provide it through a compiler builtin), and compilers do a great job of optimizing the code. Here it produces a prvalue that is then assigned to the new active union member. This is all that is left of the call to foo() at runtime:

mov eax,40000000h

CodePudding user response:

  1. Yes, it's undefined behaviour (UB), because &u.f and &u.i point to the same start address. See the declaration of memcpy:
void *   memcpy (void *__restrict, const void *__restrict, size_t);

The C99 keyword restrict is an indication to the compiler that different object pointer types and function parameter arrays do not point to overlapping regions of memory.

This enables the compiler to perform optimizations that might otherwise be prevented because of possible aliasing.

It is your responsibility to ensure that restrict-qualified pointers do not point to overlapping regions of memory.

__restrict, permitted in C90 and C++, is a synonym for restrict.

  2. Not applicable: because the code is UB, the result is undefined.

  3. Also UB, because &u.f and &u.i always point to the same start address, regardless of whether their sizes are equal. sizeof(U) is at least the size of the largest member of the union.
