Consider a page that was read from somewhere. I want to compare a value at an offset in that page to a 4-byte integer. Will the following code work on ARM as well as on x86?
bool equal(char *page, uint32 offset, uint32 data)
{
return *(uint32 *) (page + offset) == data;
}
Or do I need to memcpy 4 bytes from offset to a uint32 variable first?
CodePudding user response:
As you suspect, it is not safe to access data via a potentially unaligned pointer. It has undefined behavior, and depending on the target CPU it may work as expected (x86) or raise a bus error (CPUs that enforce alignment).
You can make your code portable using memcpy or memcmp as follows:
#include <stdbool.h>
#include <stdint.h>
#include <string.h>
bool equal(const char *page, size_t offset, uint32_t data) {
uint32_t v;
memcpy(&v, page + offset, sizeof v);
return v == data;
}
bool equal2(const char *page, size_t offset, uint32_t data) {
return !memcmp(page + offset, &data, sizeof data);
}
It is recommended to write portable code using the appropriate types (size_t for offset improves code generation) and let the compiler do the optimisation.
As can be verified on Godbolt's Compiler Explorer, the above code compiles to optimal code for equal on both x86 and ARM targets, using the appropriate load instructions.
The alternative using memcmp suggested in comments seems more complicated for both gcc and clang to optimize fully. Also note that the memcmp approach would not work in all cases for other scalar types: for example, IEEE float and double types have 2 distinct representations for positive and negative zero, which would compare equal for == but not for memcmp in a version of the above code for float or double typed data. Other scalar types in exotic architectures might also have multiple representations or even have padding bits, with the same shortcomings. Using memcpy seems preferable.
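To make the point about zeros concrete, here is a minimal sketch of a float version of both functions. The names float_equal_value and float_equal_bytes are made up for this illustration, and the behavior described assumes the target uses IEEE 754 floats.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical float variant of equal(): copy the bytes out, then
   compare values with ==. Safe for any alignment of page + offset. */
bool float_equal_value(const char *page, size_t offset, float data) {
    float v;
    assert(page != NULL);
    memcpy(&v, page + offset, sizeof v);
    return v == data;
}

/* Hypothetical float variant of equal2(): compares the raw bytes,
   so two values with different bit patterns never match. */
bool float_equal_bytes(const char *page, size_t offset, float data) {
    assert(page != NULL);
    return memcmp(page + offset, &data, sizeof data) == 0;
}
```

If the page holds -0.0f and you compare against 0.0f, the memcpy version should report a match while the memcmp version should not, because the sign bit differs in the stored bytes.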
CodePudding user response:
No, it's not "safe". An UltraSPARC CPU, for example, will raise a "Bus Error" when you try this on an unaligned address. You also did not take care of endianness, which MAY be a problem too, depending on where the data comes from.
You should guard your code with a generic, "slow" decoder (byte per byte, with shifting and endianness-aware code), and then a "fast" decoder enabled ONLY for CPUs that you know accept such accesses.
For example, for MSVC, an x86 compilation will have the _M_IX86 macro defined, so you can do something like:
// Default: do not allow unaligned access
#undef USE_UNALIGNED
#ifdef _MSC_VER
#ifdef _M_IX86
// For this compiler + CPU, unaligned access is allowed.
#define USE_UNALIGNED
#endif
#endif
....
bool equal(char *page, uint32 offset, uint32 data)
{
#ifdef USE_UNALIGNED
// Fast / unaligned method.
return *(uint32 *) (page + offset) == data;
#else
// Slow/byte per byte method.
#endif
}
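The slow path is left empty above. As a rough sketch of what a byte-per-byte, endianness-aware decoder could look like, assuming the page stores its 32-bit values in little-endian order (that is an assumption, the question does not say), with load_u32_le and equal_le being names invented for this example:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: assemble a 32-bit value from 4 bytes stored
   least significant byte first. Works for any alignment and gives the
   same result on little-endian and big-endian hosts. */
static uint32_t load_u32_le(const unsigned char *p) {
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Compare the little-endian value stored at page + offset to data. */
bool equal_le(const char *page, size_t offset, uint32_t data) {
    assert(page != NULL);
    return load_u32_le((const unsigned char *)page + offset) == data;
}
```

If the data is big-endian instead, reverse the shift amounts. Going through unsigned char avoids sign extension when a byte is 0x80 or above.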