Which types on a 64-bit computer are naturally atomic in gnu C and gnu C++? -- meaning they have atomi

Time:04-15

NB: For this question, I'm not talking about the C or C++ language standards. Rather, I'm talking about gcc compiler implementations for a particular architecture, as the only guarantees for atomicity by the language standards are to use _Atomic types in C11 or later or std::atomic<> types in C++11 or later. See also my updates at the bottom of this question.

On any architecture, some data types can be read atomically, and written atomically, while others will take multiple clock cycles and can be interrupted in the middle of the operation, causing corruption if that data is being shared across threads.

On 8-bit single-core AVR microcontrollers (ex: the ATmega328 mcu, used by the Arduino Uno, Nano, or Mini), only 8-bit data types have atomic reads and writes (with the gcc compiler and gnu C or gnu C++ language). I had a 25-hr debugging marathon in < 2 days and then wrote this answer here. See also the bottom of this question for more info and documentation on 8-bit variables having naturally atomic writes and naturally atomic reads for AVR 8-bit microcontrollers when compiled with the gcc compiler, which uses the AVR-libc library.

On (32-bit) STM32 single-core microcontrollers, any data type 32-bits or smaller is definitively automatically atomic (when compiled with the gcc compiler and the gnu C or gnu C++ language, as ISO C and C++ make no guarantees of this until the 2011 versions with _Atomic types in C11 and std::atomic<> types in C++11). That includes bool/_Bool, int8_t/uint8_t, int16_t/uint16_t, int32_t/uint32_t, float, and all pointers. The only non-atomic types are int64_t/uint64_t, double (8 bytes), and long double (also 8 bytes). I wrote about that here:

  1. Which variable types/sizes are atomic on STM32 microcontrollers?
  2. Reading a 64 bit variable that is updated by an ISR
  3. What are the various ways to disable and re-enable interrupts in STM32 microcontrollers in order to implement atomic access guards?

Now I need to know for my 64-bit Linux computer. Which types are definitively automatically atomic?

My computer has an x86-64 processor, and Linux Ubuntu OS.

I am okay using Linux headers and gcc extensions.

I see a couple of interesting things in the gcc source code indicating that at least the 32-bit int type is atomic. Ex: the Gnu header <bits/atomic_word.h>, which is stored at /usr/include/x86_64-linux-gnu/c++/8/bits/atomic_word.h on my computer, and is here online, contains this:

typedef int _Atomic_word;

So, int is clearly atomic.

And the Gnu header <bits/types.h>, included by <ext/atomicity.h>, and stored at /usr/include/x86_64-linux-gnu/bits/types.h on my computer, contains this:

/* C99: An integer type that can be accessed as an atomic entity,
   even in the presence of asynchronous interrupts.
   It is not currently necessary for this to be machine-specific.  */
typedef int __sig_atomic_t;

So, again, int is clearly atomic.

Here is some sample code to show what I am talking about...

...when I say that I want to know which types have naturally atomic reads, and naturally atomic writes, but not atomic increment, decrement, or compound assignment.

volatile bool shared_bool;
volatile uint8_t shared_u8;
volatile uint16_t shared_u16;
volatile uint32_t shared_u32;
volatile uint64_t shared_u64;
volatile float shared_f; // 32-bits
volatile double shared_d; // 64-bits

// Task (thread) 1
while (true)
{
    // Write to the values in this thread.
    //
    // What I write to each variable will vary. Since other threads are reading
    // these values, I need to ensure my *writes* are atomic, or else I must
    // use a mutex to prevent another thread from reading a variable in the
    // middle of this thread's writing.
    shared_bool = true;
    shared_u8 = 129;
    shared_u16 = 10108;
    shared_u32 = 130890;
    shared_f = 1083.108;
    shared_d = 382.10830;
}

// Task (thread) 2
while (true)
{
    // Read from the values in this thread.
    //
    // What thread 1 writes into these values can change at any time, so I need
    // to ensure my *reads* are atomic, or else I'll need to use a mutex to
    // prevent the other thread from writing to a variable in the midst of
    // reading it in this thread.
    if (shared_bool == whatever)
    {
        // do something
    }
    if (shared_u8 == whatever)
    {
        // do something
    }
    if (shared_u16 == whatever)
    {
        // do something
    }
    if (shared_u32 == whatever)
    {
        // do something
    }
    if (shared_u64 == whatever)
    {
        // do something
    }
    if (shared_f == whatever)
    {
        // do something
    }
    if (shared_d == whatever)
    {
        // do something
    }
}

C _Atomic types and C++ std::atomic<> types

I know C11 and later offers _Atomic types, such as this:

const _Atomic int32_t i;
// or (same thing)
const atomic_int_least32_t i;

See here:

  1. https://en.cppreference.com/w/c/thread
  2. https://en.cppreference.com/w/c/language/atomic

And C++11 and later offers std::atomic<> types, such as this:

const std::atomic<int32_t> i;
// or (same thing)
const atomic_int32_t i;

See here:

  1. https://en.cppreference.com/w/cpp/atomic/atomic

And these C11 and C++11 "atomic" types offer atomic reads and atomic writes as well as atomic increment operator, decrement operator, and compound assignment...

...but that's not really what I'm talking about.

I want to know which types have naturally atomic reads and naturally atomic writes only. For what I am talking about, increment, decrement, and compound assignment will not be naturally atomic.
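For comparison, the standard atomic types can also be limited to plain reads and writes. The following is only a sketch, assuming C11 `<stdatomic.h>` is available; the names `shared_u64_atomic`, `publish_u64`, and `peek_u64` are made up for illustration and are not part of the question's code. With `memory_order_relaxed`, GCC on x86-64 typically compiles each function to one ordinary `mov`, so the cost is usually the same as the "natural" access asked about here, but the guarantee comes from the language rather than from an assumption about the compiler.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical shared variable; the name is illustrative only. */
static _Atomic uint64_t shared_u64_atomic;

/* Atomic write only: no read-modify-write, no ordering beyond atomicity. */
void publish_u64(uint64_t value)
{
    atomic_store_explicit(&shared_u64_atomic, value, memory_order_relaxed);
}

/* Atomic read only: the caller gets a whole value, never a torn one. */
uint64_t peek_u64(void)
{
    return atomic_load_explicit(&shared_u64_atomic, memory_order_relaxed);
}
```

Increment, decrement, and compound assignment are simply not used in this sketch, so only the read and write paths are atomic, which matches the access pattern described in the question.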


Update 14 Apr. 2022

I had some chats with someone from ST, and it seems the STM32 microcontrollers only guarantee atomic reads and writes for variables of certain sizes under these conditions:

  1. You use assembly.
  2. You use the C11 _Atomic types or the C++11 std::atomic<> types.
  3. You use the gcc compiler with gnu language and gcc extensions.
    1. I'm most interested in this last one, since that's what the crux of my assumptions at the top of this question seem to have been based on for the last 10 years, without me realizing it. I'd like help finding the gcc compiler manual and the places in it where it explains these atomic access guarantees that apparently exist. We should check the:
      1. AVR gcc compiler manual for 8-bit AVR ATmega microcontrollers.
      2. STM32 gcc compiler manual for 32-bit ST microcontrollers.
      3. x86-64 gcc compiler manual??--if such a thing exists, for my 64-bit Ubuntu computer.

My research thus far:

  1. AVR gcc: no avr gcc compiler manual exists. Rather, use the AVR-libc manual here: https://www.nongnu.org/avr-libc/ --> "Users Manual" links.

    1. The AVR-libc user manual in the <util/atomic> section backs up my claim that 8-bit types on AVR, when compiled by gcc, already have naturally atomic reads and naturally atomic writes when it implies that 8-bit reads and writes are already atomic by saying (emphasis added):

    A typical example that requires atomic access is a 16 (or more) bit variable that is shared between the main execution path and an ISR.

    1. It is talking about C code, not assembly, as all examples it gives on that page are in C, including the one for the volatile uint16_t ctr variable, immediately following that quote.

CodePudding user response:

The answer from the point of view of the language standard is very simple: none of them are "definitively automatically" atomic.

First of all, it's important to distinguish between two senses of "atomic".

  • One is atomic with respect to signals. This ensures, for instance, that when you do x = 5 on a sig_atomic_t, then a signal handler invoked in the current thread will see either the old or new value. This is usually accomplished simply by doing the access in one instruction, since signals can only be triggered by hardware interrupts, which can only arrive between instructions. For instance, x86 add dword ptr [var], 12345, even without a lock prefix, is atomic in this sense.

  • The other is atomic with respect to threads, so that another thread accessing the object concurrently will see a correct value. This is more difficult to get right. In particular, ordinary variables of type sig_atomic_t are not atomic with respect to threads. You need _Atomic or std::atomic to get that.
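The first sense can be sketched in standard C as follows. This is a minimal illustration, assuming a hosted implementation; the names `got_signal`, `on_sigint`, and `demo_signal_flag` are hypothetical. It shows atomicity with respect to a signal handler running in the same thread only, and says nothing about other threads.

```c
#include <signal.h>

/* Flag shared between normal code and the signal handler. */
static volatile sig_atomic_t got_signal = 0;

/* Handler: a single store to a volatile sig_atomic_t object. */
static void on_sigint(int signo)
{
    (void)signo;
    got_signal = 1;
}

/* Installs the handler, raises SIGINT in the calling thread, and returns
 * the flag value observed afterwards (or -1 if the handler could not be
 * installed). */
int demo_signal_flag(void)
{
    got_signal = 0;
    if (signal(SIGINT, on_sigint) == SIG_ERR)
        return -1;
    raise(SIGINT);   /* does not return until the handler has returned */
    return (int)got_signal;
}
```

If a second thread were to read `got_signal` while it is being written, this would still be a data race as far as the language is concerned, which is the second sense described above.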

Note well that the internal names your implementation chooses for its types are not evidence of anything. From typedef int _Atomic_word; I would certainly not infer that "int is clearly atomic"; I don't know in what sense the implementers were using the word "atomic", or whether it's accurate (could be used by legacy code, for instance). If they wanted to make such a promise it would be in the documentation, not in an unexplained typedef in a bits header that is never meant to be seen by the application programmer.


The fact that your hardware may make certain types of access "automatically atomic" does not tell you anything at the level of C/C++. For instance, it is true on x86 that ordinary full-size loads and stores to naturally aligned variables are atomic. But in the absence of std::atomic, the compiler is under no obligation to emit ordinary full-size loads and stores; it is entitled to be clever and access those variables in other ways. It "knows" this will be no problem, because concurrent access would be a data race, and of course the programmer would never write code with a data race, would they?

As a concrete example, consider the following code:

unsigned x;

unsigned foo(void) {
    return (x >> 8) & 0xffff;
}

A load of a nice 32-bit integer variable, followed by some arithmetic. What could be more innocent? Yet check out the assembly emitted by GCC 11.2 -O2 (try on godbolt):

foo:
        movzx   eax, WORD PTR x[rip+1]
        ret

Oh dear. A partial load, and unaligned to boot. AFAIK x86 provides no atomicity promises about unaligned loads.
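A hedged sketch of the C11 counterpart, assuming `<stdatomic.h>`; the names `x_atomic`, `foo_atomic`, and `set_x_atomic` are invented here. Declaring the variable `_Atomic` and loading it explicitly asks for one atomic load of the whole object, and the shift and mask then operate on a local copy. In practice GCC is expected to emit a full 32-bit load followed by the arithmetic, although the exact instructions depend on the compiler version and flags.

```c
#include <stdatomic.h>

/* Hypothetical atomic counterpart of the plain `unsigned x` above. */
static _Atomic unsigned x_atomic;

/* Helper so that other code can set the value atomically. */
void set_x_atomic(unsigned value)
{
    atomic_store_explicit(&x_atomic, value, memory_order_relaxed);
}

unsigned foo_atomic(void)
{
    /* One atomic load of the whole object, then arithmetic on the copy. */
    unsigned snapshot = atomic_load_explicit(&x_atomic, memory_order_relaxed);
    return (snapshot >> 8) & 0xffff;
}
```

Relaxed ordering is used because the example only cares about the load being indivisible, not about ordering against other memory accesses.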


Here is another fun example, this time on ARM64. Aligned 64-bit stores are atomic, per B2.2.1 of the ARMv8-A Architecture Reference Manual. So this looks fine:

unsigned long x;

void bar(void) {
    x = 0xdeadbeefdeadbeef;
}

But, GCC 11.2 -O2 gives (godbolt):

bar:
        adrp    x1, .LANCHOR0
        add     x2, x1, :lo12:.LANCHOR0
        mov     w0, 48879
        movk    w0, 0xdead, lsl 16
        str     w0, [x1, #:lo12:.LANCHOR0]
        str     w0, [x2, 4]
        ret

That's two 32-bit strs, not atomic in any way. A reader may very well read 0x00000000deadbeef.

Why do it this way? Materializing a 64-bit constant in a register takes several instructions on ARM64, with its fixed instruction size. But both halves of the value are equal, so why not materialize the 32-bit value and store it to each half?

(If you do unsigned long *p; *p = 0xdeadbeefdeadbeef then you get stp w1, w1, [x0] (godbolt). Which looks more promising as it is a single instruction, but in fact is still two separate writes for purposes of atomicity between threads.)
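As a sketch of the alternative, assuming C11 `<stdatomic.h>` and with the made-up names `x64_atomic`, `bar_atomic`, and `read_x64_atomic`: when the object is `_Atomic` and the store is an explicit atomic store, the compiler is no longer free to split it into two halves that a concurrent reader could observe separately.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical atomic counterpart of the plain `unsigned long x` above. */
static _Atomic uint64_t x64_atomic;

void bar_atomic(void)
{
    /* A single atomic 64-bit store; a reader sees the old or the new value. */
    atomic_store_explicit(&x64_atomic, UINT64_C(0xdeadbeefdeadbeef),
                          memory_order_relaxed);
}

uint64_t read_x64_atomic(void)
{
    return atomic_load_explicit(&x64_atomic, memory_order_relaxed);
}
```

The compiler may still need several instructions to build the constant in a register; what changes is that the write to memory itself has to be indivisible.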


User supercat's answer to Are concurrent unordered writes with fencing to shared memory undefined behavior? has another nice example for ARM32 Thumb, where the C source asks for an unsigned short to be loaded once, but the generated code loads it twice. In the presence of concurrent writes, you could get an "impossible" result.

One can provoke the same on x86-64 (godbolt):

_Bool x, y, z;

void foo(void) {
    _Bool tmp = x;
    y = tmp;
    // imagine elaborate computation here that needs lots of registers
    z = tmp;
}

GCC will reload x instead of spilling tmp. On x86 you can load a global with just one instruction, but spilling to the stack would need at least two. So if x is being concurrently modified, either by threads or by signals/interrupts, then assert(y == z) afterwards could fail.
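A sketch of how this could be avoided, assuming C11 `<stdatomic.h>`; the names `x_flag`, `y_copy`, `z_copy`, `set_x_flag`, `foo_once`, and `copies_match` are hypothetical. An explicit atomic load is a single read in the abstract machine, so the compiler cannot quietly replace it with two reads that might see different values.

```c
#include <stdatomic.h>
#include <stdbool.h>

static atomic_bool x_flag;    /* may be modified concurrently */
static bool y_copy, z_copy;   /* used only by the thread calling foo_once() */

void set_x_flag(bool value)
{
    atomic_store_explicit(&x_flag, value, memory_order_relaxed);
}

void foo_once(void)
{
    /* Exactly one atomic read of x_flag; both copies come from tmp. */
    bool tmp = atomic_load_explicit(&x_flag, memory_order_relaxed);
    y_copy = tmp;
    /* imagine elaborate computation here that needs lots of registers */
    z_copy = tmp;
}

/* True if both copies hold the same value after foo_once(). */
bool copies_match(void)
{
    return y_copy == z_copy;
}
```

If register pressure is high, the compiler now has to spill `tmp` or keep it in a register; reloading `x_flag` is no longer an option.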


It really isn't safe to assume anything beyond what the language actually guarantees, which is nothing unless you use std::atomic. Modern compilers know the exact limits of the language rules very well, and optimize aggressively. They can and will break code that assumes they will do what would be "natural", if that is outside the bounds of what the language promises, and they will very often do it in ways that one would never expect.

CodePudding user response:

"On 8-bit AVR microcontrollers (ex: the ATmega328 mcu, used by the Arduino Uno or Mini), only 8-bit data types have atomic reads and writes."

Only in case you write your code in assembler, not in C.

"On (32-bit) STM32 microcontrollers, any data type 32-bits or smaller is definitively automatically atomic."

Only in case you write your code in assembler, not in C. Additionally, only if the ISA guarantees that the generated instruction is atomic; I don't remember if this is true for all ARM instructions.

"That includes bool/_Bool, int8_t/uint8_t, int16_t/uint16_t, int32_t/uint32_t, float, and all pointers."

No, that is definitely wrong.

"Now I need to know for my 64-bit Linux computer. Which types are definitively automatically atomic?"

The same types as in AVR and STM32: none.

This all boils down to the fact that a variable access in C cannot be guaranteed to be atomic, because it might get carried out in multiple instructions, or in some cases in instructions for which the ISA doesn't guarantee atomicity.

The only types that can be regarded as atomic in C (and C++) are those with the _Atomic qualifier from C11/C++11. Period.
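With those types, the question "which sizes are atomic without a lock on this machine" can be put to the implementation itself. A small sketch, assuming C11 `<stdatomic.h>`; the function names are made up. Each `ATOMIC_*_LOCK_FREE` macro expands to 0 (never lock-free), 1 (sometimes lock-free), or 2 (always lock-free) for the corresponding atomic type.

```c
#include <stdatomic.h>

/* Lock-free level of the atomic counterpart of each type: 0, 1, or 2. */
int lock_free_level_bool(void)    { return ATOMIC_BOOL_LOCK_FREE; }
int lock_free_level_int(void)     { return ATOMIC_INT_LOCK_FREE; }
int lock_free_level_llong(void)   { return ATOMIC_LLONG_LOCK_FREE; }
int lock_free_level_pointer(void) { return ATOMIC_POINTER_LOCK_FREE; }
```

These macros describe the `_Atomic` versions of the types only. They say nothing about plain `int` or `long long` objects, so they do not make ordinary variables safe to share.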

This answer of mine at EE here is a duplicate. It addresses the microcontroller cases explicitly, race conditions, use of volatile, dangerous optimizations etc. It also contains a simple way to protect from race conditions in interrupts which is applicable to all MCUs where interrupts cannot be interrupted. A quote from that answer:

When writing C, all communication between an ISR and the background program must be protected against race conditions. Always, every time, no exceptions. The size of the MCU data bus does not matter, because even if you do a single 8 bit copy in C, the language cannot guarantee atomicity of operations. Not unless you use the C11 feature _Atomic. If this feature isn't available, you must use some manner of semaphore or disable the interrupt during read etc. Inline assembler is another option. volatile does not guarantee atomicity.
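On a hosted Linux system with threads rather than ISRs, the usual counterpart of the guard described in that quote is a mutex. A minimal sketch, assuming POSIX threads; `struct sample`, `shared_sample`, `sample_lock`, `sample_write`, and `sample_read` are hypothetical names. It protects a value that is wider than any single atomic access, so a reader always gets a consistent pair.

```c
#include <pthread.h>
#include <stdint.h>

/* Hypothetical shared state that is wider than one machine word. */
struct sample {
    uint64_t timestamp;
    double   value;
};

static struct sample shared_sample;
static pthread_mutex_t sample_lock = PTHREAD_MUTEX_INITIALIZER;

/* Writer: both fields are updated while the lock is held. */
void sample_write(uint64_t timestamp, double value)
{
    pthread_mutex_lock(&sample_lock);
    shared_sample.timestamp = timestamp;
    shared_sample.value = value;
    pthread_mutex_unlock(&sample_lock);
}

/* Reader: copies the whole struct under the lock and returns the copy. */
struct sample sample_read(void)
{
    pthread_mutex_lock(&sample_lock);
    struct sample copy = shared_sample;
    pthread_mutex_unlock(&sample_lock);
    return copy;
}
```

This applies between threads only. `pthread_mutex_lock` is not async-signal-safe, so a signal handler or an ISR needs a different guard, such as the interrupt disabling mentioned in the quote.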
