Is it ok to access data using cast when I am not sure about alignment

Time:03-29

Consider a page that was read from somewhere. I want to compare the value at a given offset in that page to a 4-byte integer. Will the following code work on ARM as well as on x86?

bool equal(char *page, uint32 offset, uint32 data)
{
    return *(uint32 *) (page + offset) == data;
}

Or do I need to memcpy 4 bytes from offset to a uint32 variable first?

CodePudding user response:

As you suspect, it is not safe to access data through a potentially unaligned pointer. The behavior is undefined: depending on the target CPU, it may work as expected (x86) or raise a bus error (CPUs that enforce strict alignment).

You can make your code portable using memcpy or memcmp as follows:

#include <stdbool.h>
#include <stdint.h>
#include <string.h>

bool equal(const char *page, size_t offset, uint32_t data) {
    uint32_t v;
    memcpy(&v, page + offset, sizeof v);
    return v == data;
}

bool equal2(const char *page, size_t offset, uint32_t data) {
    return !memcmp(page + offset, &data, sizeof data);
}

It is recommended to write portable code using the appropriate types (size_t for offset improves code generation) and let the compiler do the optimisation.

As can be verified on Godbolt's Compiler Explorer, the above code compiles to optimal code for equal on both x86 and ARM targets, using the appropriate load instructions.

The alternative using memcmp suggested in comments seems more complicated for both gcc and clang to optimize fully. Also note that the memcmp approach would not work in all cases for other scalar types: for example IEEE float and double types have 2 distinct representations for positive and negative zero, which would compare equal for == but not for memcmp in a version of the above code for float or double typed data. Other scalar types in exotic architectures might also have multiple representations or even have padding bits with the same shortcomings. Using memcpy seems preferable.
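To make the signed-zero point concrete, here is a minimal sketch of float versions of the two functions. The names equal_float and equal_float_bytes are hypothetical, and the sketch assumes float is an IEEE 754 binary32 value.

```c
#include <stdbool.h>
#include <string.h>

/* Compare by value: copy the bytes into a float, then use ==. */
bool equal_float(const char *page, size_t offset, float data) {
    float v;
    memcpy(&v, page + offset, sizeof v);
    return v == data;
}

/* Compare by representation: byte-for-byte with memcmp. */
bool equal_float_bytes(const char *page, size_t offset, float data) {
    return !memcmp(page + offset, &data, sizeof data);
}
```

If the page holds -0.0f and data is 0.0f, equal_float reports a match while equal_float_bytes does not, because the two zeros differ in the sign bit.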

CodePudding user response:

No, it's not "safe". An UltraSPARC CPU, for example, will raise a "Bus Error" when you try this with an unaligned address. You also didn't take endianness into account, which MAY be a problem too, depending on where the data comes from.

You should guard your code with a generic, "slow" decoder (byte by byte, with shifting and endianness-aware code), and then enable a "fast" decoder ONLY for CPUs that you know accept such accesses.

For example, with MSVC, an x86 compilation will have the _M_IX86 macro defined, so you can do something like:

// Default: do not allow unaligned access
#undef USE_UNALIGNED
#ifdef _MSC_VER
  #ifdef _M_IX86
    // For this compiler+CPU, OK, using unaligned is allowed.
    #define USE_UNALIGNED
  #endif
#endif

....

bool equal(char *page, uint32 offset, uint32 data)
{
#ifdef USE_UNALIGNED
    // Fast / unaligned method.
    return *(uint32 *) (page + offset) == data;
#else
    // Slow/byte per byte method.
#endif
}
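The slow branch is left empty in the function above. One possible way to fill it in is sketched below. The names load_le32 and equal_slow are hypothetical, and the sketch assumes the page stores the value in little-endian byte order; if the data is big-endian, the shifts have to be reversed.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: the value is stored least significant byte first.
   Reading single bytes has no alignment requirement. */
static uint32_t load_le32(const unsigned char *p) {
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Slow, byte-by-byte comparison that works at any offset. */
bool equal_slow(const char *page, size_t offset, uint32_t data) {
    return load_le32((const unsigned char *)page + offset) == data;
}
```

Unlike the cast in the fast path, which reads the bytes in the host's native order, this version gives the same result on little-endian and big-endian hosts for the same page contents.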