reinterpret_cast a slice of byte array?

Time:09-07

If there is a buffer that is supposed to pack 3 integer values, and you want to increment the one in the middle, the following code works as expected:

#include <iostream>
#include <cstring>

int main()
{
    char buffer[] = {'\0','\0','\0','\0','A','\0','\0','\0','\0','\0','\0','\0'};
    
    int tmp;

    memcpy(&tmp, buffer + 4, 4); // unpack buffer[5:8] to tmp
    std::cout<<buffer[4];              // prints A

    tmp++;
    memcpy(buffer + 4, &tmp, 4); // pack tmp value back to buffer[5:8]
    std::cout<<buffer[4];              // prints B

    return 0;
}

To me this looks like too many operations are taking place for a simple action of merely modifying some data in a buffer array, i.e. pushing a new variable to the stack, copying the specific region from the buffer to that var, incrementing it, then copying it back to the buffer.

I was wondering whether it's possible to cast the 5:8 range from the byte array to an int* variable and increment it, for example:

  int *tmp = reinterpret_cast < int *>(buffer[5:8]);
  (*tmp)++;

It's more efficient this way, no need for the 2 memcpy calls.

CodePudding user response:

The latter approach is technically undefined, though it's likely to work on any sane implementation. Your syntax is slightly off, but something like this will probably work:

int* tmp = reinterpret_cast<int*>(buffer + 4);
(*tmp)++;

The problem is that it runs afoul of C++'s strict aliasing rules. Essentially, you're allowed to treat any object as an array of char, but you're not allowed to treat an array of char as anything else. Thus, to be fully compliant, you need to take the approach you did in the first snippet: treat an int as an array of char (which is allowed), copy the bytes from the array into it, manipulate it as desired, and then copy it back.
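The copy-out, modify, copy-back pattern described above can be wrapped in small helpers so that call sites stay short. This is only a sketch: the names load_int, store_int and increment_int_at are hypothetical (they are not from the question), and the code assumes the buffer holds ints in the machine's native byte order.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: read the int stored at byte `offset` of `buf`.
// std::memcpy only touches the bytes as char, so no aliasing rule is broken.
int load_int(const char* buf, std::size_t offset)
{
    int value;
    std::memcpy(&value, buf + offset, sizeof value);
    return value;
}

// Hypothetical helper: write `value` at byte `offset` of `buf`.
void store_int(char* buf, std::size_t offset, int value)
{
    std::memcpy(buf + offset, &value, sizeof value);
}

// Increment the int stored at byte `offset` of a buffer of `size` bytes.
void increment_int_at(char* buf, std::size_t size, std::size_t offset)
{
    assert(offset + sizeof(int) <= size); // the int must fit inside the buffer
    store_int(buf, offset, load_int(buf, offset) + 1);
}
```

With the buffer from the question, the call would be increment_int_at(buffer, sizeof buffer, 4).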


I would note that if you're concerned with runtime efficiency, you probably shouldn't be. Compilers are very good at optimizing these sorts of things, and will likely end up just manipulating the bytes in place. For instance, clang with -O2 compiles your first snippet (with std::cout replaced with printf to avoid stream I/O overhead) down to:

mov     edi, 65
call    putchar
mov     edi, 66
call    putchar


Remember, when writing C++, you are describing the behavior of the program you want the compiler to write, not writing the instructions the machine will execute.

CodePudding user response:

Simply change buffer[5:8] to buffer + 4, just like in your memcpy() calls, and then it will likely work the way you want:

int *tmp = reinterpret_cast<int*>(buffer + 4 /* or: &buffer[4] */);
(*tmp)++;

Alternatively, you can use a reference instead of a pointer:

int &tmp = reinterpret_cast<int&>(buffer[4] /* or: *(buffer + 4) */);
tmp++;

However, note that either approach is technically undefined behavior, as accessing the array like this violates the Strict Aliasing rules. The memcpy() approach is the safe and standard way to go, and compilers are very good about optimizing memcpy() calls.
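If the memcpy() approach feels too verbose at the call site, one possible way to hide it is a small generic wrapper. The following is a sketch only: update_in_buffer and example are made-up names, the code assumes C++11 or later, and T is assumed to be default-constructible and stored in native byte order.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>

// Hypothetical helper: copy a T out of the buffer, let `f` modify it,
// then copy the result back to the same position.
template <typename T, typename F>
void update_in_buffer(char* buf, std::size_t offset, F&& f)
{
    static_assert(std::is_trivially_copyable<T>::value,
                  "memcpy is only valid for trivially copyable types");
    T value;
    std::memcpy(&value, buf + offset, sizeof(T)); // unpack
    f(value);                                     // modify the local copy
    std::memcpy(buf + offset, &value, sizeof(T)); // pack back
}

// Usage sketch: increment the middle of three packed ints.
inline void example()
{
    char buffer[3 * sizeof(int)] = {};
    update_in_buffer<int>(buffer, sizeof(int), [](int& v) { ++v; });

    int middle;
    std::memcpy(&middle, buffer + sizeof(int), sizeof middle);
    assert(middle == 1);
}
```

The wrapper still performs the same two copies as the first snippet in the question, so the compiler has the same opportunity to optimize them away.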

But, the reinterpret_cast approach will likely work nonetheless, depending on your compiler.
