Do ARM-CPUs need special pointer-decoration for unaligned accesses?

Time:03-19

Do ARM CPUs that support unaligned memory accesses need special pointer decoration for unaligned accesses in C / C++? Or can I use any pointer for unaligned accesses? Or is this compiler-dependent?

CodePudding user response:

In short, this is compiler-dependent since it is not covered by the C standard.

However, as noted in the comments some ARM instructions require an aligned pointer and any ARM compiler would need to implement some alignment strategy. Since ARM processors work much more efficiently with aligned access it is likely that the compiler will normally ensure that data is aligned.

It is also likely that the compiler provides ways of working with non-aligned data (of course, this would again be compiler-defined behavior implementing what would be undefined behavior in the C standard). Common examples are packed structures and casting of pointers.

Let us look at a few cases:

    __packed struct
    {
        char a;
        int i;
    } s;

In this case, &s.i is likely to be an unaligned pointer which is fine because the compiler knows that and can generate code accordingly.
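As a rough illustration of the layout difference, the sketch below compares a packed and an unpacked version of the same struct. It assumes the GCC/Clang spelling `__attribute__((packed))` rather than the `__packed` keyword used above, and the names `packed_pair`, `plain_pair` and `read_packed_i` are made up for this example.

```c
#include <stddef.h>  /* offsetof */

/* Assumption: GCC/Clang attribute syntax; other compilers spell "packed" differently. */
struct __attribute__((packed)) packed_pair {
    char a;
    int  i;   /* placed directly after 'a', at offset 1 */
};

struct plain_pair {
    char a;
    int  i;   /* the compiler inserts padding so that 'i' is naturally aligned */
};

/* The member access goes through the packed type, so the compiler knows
   that 'i' may be unaligned and picks a suitable load sequence. */
int read_packed_i(const struct packed_pair *p)
{
    return p->i;
}
```

Accessing the member through the struct type is what lets the compiler see the alignment; a plain `int *` taken from `&p->i` would no longer carry that information.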

    char buffer[80];

    void decode(int *i)
    {
        int n = i[0];
        ...
    }

In this case buffer may not be aligned for int (as an array of chars, it has no need to be). However, if the compiler normally aligns ints, it will assume that the pointer i in decode is aligned and may generate code based on that assumption.

In that case, calling decode((int *)buffer) could lead to a hard fault in the processor.
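One way to avoid the mismatch is to give the buffer the alignment the callee expects, and to check an address before treating it as an int. This is only a sketch using C11 `_Alignas`/`_Alignof`; `aligned_buffer` and `is_aligned_for_int` are hypothetical names, and the check relies on the pointer-to-integer conversion behaving as it does on common ARM toolchains.

```c
#include <stdint.h>  /* uintptr_t */

/* Same size as the buffer above, but explicitly aligned for int (C11). */
_Alignas(int) char aligned_buffer[80];

/* Returns nonzero if p meets the alignment requirement of int. */
int is_aligned_for_int(const void *p)
{
    return ((uintptr_t)p % _Alignof(int)) == 0;
}
```

With this, `aligned_buffer` itself passes the check, while an address such as `aligned_buffer + 1` does not and should not be handed to decode.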

Hence, the longer answer is that (at least in the cases I know of) there is no visible "decoration" for aligned/unaligned pointers, but the compiler may make assumptions based on the type and origin of the pointer and thus have a kind of internal "decoration" of pointers. In that case it is important to avoid "cheating" the compiler into making a wrong assumption.

CodePudding user response:

A standard-conforming C/C++ program cannot perform an unaligned pointer access, because you cannot legally form such a pointer. The compiler guarantees alignment by adding appropriate padding in structures, and malloc/new return suitably aligned memory blocks.

In practice compilers are more lenient, and you can create an unaligned pointer by casting e.g. a char* to int* when the address is not aligned. The C standard makes the result of such a conversion undefined behavior, so you are already on shaky ground.

Worse (for you) is that on ARM the CPU doesn't like unaligned access and has a flag that will make any attempt to access memory unaligned cause a CPU exception. Not every OS sets this flag to fault but you can't assume it is not set. The next OS update might set the flag. The compiler and you have to generate code that only uses aligned access.

Now sometimes you do have data that is not aligned and there are basically only 2 ways to access it safely:

    char *buf = ....;
    uint32_t t;
    memcpy(&t, &buf[123], sizeof(t));  // destination must be the address of t

or

    struct __attribute__((packed)) DiskLayout { // GCC/Clang syntax; replace with your compiler's "packed" attribute
        ...
        uint32_t magic;
        ...
    };
    struct DiskLayout disk;
    read_from_disk(&disk);
    uint32_t magic = disk.magic;

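In the first case memcpy() makes no assumption about the alignment of its arguments, so the copy is safe for any address. Compilers typically inline a small fixed-size call like this and emit whatever load sequence is safe for the target.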

In the second case the packed attribute forces the compiler not to add any padding. It also forces the alignment of the structure to 1, so disk.magic is only guaranteed to be 1-byte aligned and the compiler has to generate code accordingly. On a target where unaligned word access is not available, that means reading 4 individual bytes and combining them back into a 32-bit value, which is much slower than a single 32-bit read. Similarly, on a write the compiler has to split the value and write 4 individual bytes.
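To make the byte-wise access concrete, here is roughly what such code does when written out by hand in C. This is only a sketch: it assumes little-endian byte order, and the function names are made up for the example.

```c
#include <stdint.h>

/* Read a 32-bit little-endian value from any address, one byte at a time. */
uint32_t read_u32_le_bytewise(const unsigned char *p)
{
    return  (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Write a 32-bit value to any address as four little-endian bytes. */
void write_u32_le_bytewise(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)(v & 0xFFu);
    p[1] = (unsigned char)((v >> 8) & 0xFFu);
    p[2] = (unsigned char)((v >> 16) & 0xFFu);
    p[3] = (unsigned char)((v >> 24) & 0xFFu);
}
```

Four loads plus shifts and ORs replace one word load, which is where the extra cost of working directly on packed data comes from.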

So the basic rule for unaligned access is: Always work on aligned copies of the data. Only ever read or write the value once.
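A minimal sketch of that rule, built on the memcpy() approach: load the value once into an aligned local, work on the local, and store it back once. The helpers `load_u32`, `store_u32` and `add_to_field` are hypothetical names for this example.

```c
#include <stdint.h>
#include <string.h>

/* Copy 4 bytes from any address into an aligned local and return it. */
uint32_t load_u32(const void *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Copy an aligned value out to any address. */
void store_u32(void *p, uint32_t v)
{
    memcpy(p, &v, sizeof v);
}

/* Read once, work on the aligned copy, write once. */
void add_to_field(unsigned char *field, uint32_t delta)
{
    uint32_t v = load_u32(field);
    v += delta;
    store_u32(field, v);
}
```

Because the values keep the host's byte order, this is only suitable when the data was produced with the same endianness.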

If you want to use the packed attribute, keep both a packed struct and a normal struct. Copy from the packed struct to the normal one, work on that, and then copy it back. Don't work on the packed struct directly; the code will be much slower.
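The two-struct pattern might look like the sketch below. The field layout is invented for illustration (it is not the DiskLayout from above), and `__attribute__((packed))` again assumes GCC or Clang.

```c
#include <stdint.h>

/* Hypothetical on-disk layout: no padding, alignment 1. */
struct __attribute__((packed)) disk_record {
    uint8_t  tag;
    uint32_t magic;
    uint16_t count;
};

/* Naturally aligned working copy of the same fields. */
struct record {
    uint8_t  tag;
    uint32_t magic;
    uint16_t count;
};

/* Each packed field is read exactly once. */
struct record unpack_record(const struct disk_record *d)
{
    struct record r;
    r.tag   = d->tag;
    r.magic = d->magic;
    r.count = d->count;
    return r;
}

/* Each packed field is written exactly once. */
void pack_record(struct disk_record *d, const struct record *r)
{
    d->tag   = r->tag;
    d->magic = r->magic;
    d->count = r->count;
}
```

All intermediate work then happens on `struct record`, where every access is an ordinary aligned load or store.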
