Which types on a 64-bit computer are naturally atomic in C and C++, meaning they have atomic reads and atomic writes?

On any architecture, some data types can be read atomically, and written atomically, while others will take multiple clock cycles and can be interrupted in the middle of the operation, causing corruption if that data is being shared across threads.

On 8-bit AVR microcontrollers (ex: the ATmega328 mcu, used by the Arduino Uno or Mini), only 8-bit data types have atomic reads and writes. I had a 25-hr debugging marathon in < 2 days and then wrote this answer here.

On (32-bit) STM32 microcontrollers, any data type 32 bits or smaller is definitively automatically atomic. That includes bool/_Bool, int8_t/uint8_t, int16_t/uint16_t, int32_t/uint32_t, float, and all pointers. The only non-atomic types are int64_t/uint64_t, double (8 bytes), and long double (also 8 bytes). I wrote about that here:

Now I need to know for my 64-bit Linux computer. Which types are definitively automatically atomic?

My computer has an x86-64 processor, and Linux Ubuntu OS.

I am okay using Linux headers and gcc extensions.

I see a couple of interesting things in the gcc source code indicating that at least the 32-bit int type is atomic. Ex: the GNU header <bits/atomic_word.h>, which is stored at /usr/include/x86_64-linux-gnu/c++/8/bits/atomic_word.h on my computer, and is here online, contains this:

typedef int _Atomic_word;

So, int is clearly atomic.

And the Gnu header <bits/types.h>, included by <ext/atomicity.h>, and stored at /usr/include/x86_64-linux-gnu/bits/types.h on my computer, contains this:

/* C99: An integer type that can be accessed as an atomic entity,
   even in the presence of asynchronous interrupts.
   It is not currently necessary for this to be machine-specific.  */
typedef int __sig_atomic_t;

So, again, int is clearly atomic.

Here is some sample code to show what I am talking about...

...when I say that I want to know which types have naturally atomic reads, and naturally atomic writes, but not atomic increment, decrement, or compound assignment.

volatile bool shared_bool;
volatile uint8_t shared_u8;
volatile uint16_t shared_u16;
volatile uint32_t shared_u32;
volatile uint64_t shared_u64;
volatile float shared_f; // 32-bits
volatile double shared_d; // 64-bits

// Task (thread) 1
while (true)
{
    // Write to the values in this thread.
    //
    // What I write to each variable will vary. Since other threads are reading
    // these values, I need to ensure my *writes* are atomic, or else I must
    // use a mutex to prevent another thread from reading a variable in the
    // middle of this thread's writing.
    shared_bool = true;
    shared_u8 = 129;
    shared_u16 = 10108;
    shared_u32 = 130890;
    shared_f = 1083.108;
    shared_d = 382.10830;
}

// Task (thread) 2
while (true)
{
    // Read from the values in this thread.
    //
    // What thread 1 writes into these values can change at any time, so I need
    // to ensure my *reads* are atomic, or else I'll need to use a mutex to
    // prevent the other thread from writing to a variable in the midst of
    // reading it in this thread.
    if (shared_bool == whatever)
    {
        // do something
    }
    if (shared_u8 == whatever)
    {
        // do something
    }
    if (shared_u16 == whatever)
    {
        // do something
    }
    if (shared_u32 == whatever)
    {
        // do something
    }
    if (shared_u64 == whatever)
    {
        // do something
    }
    if (shared_f == whatever)
    {
        // do something
    }
    if (shared_d == whatever)
    {
        // do something
    }
}

C `_Atomic` types and C++ `std::atomic<>` types

I know C11 and later offers _Atomic types, such as this:

const _Atomic int32_t i;
// or (same thing)
const atomic_int_least32_t i;

See here:

And C++11 and later offers std::atomic<> types, such as this:

const std::atomic<int32_t> i;
// or (same thing)
const atomic_int32_t i;

See here:

https://en.cppreference.com/w/cpp/atomic/atomic

And these C11 and C++11 "atomic" types offer atomic reads and atomic writes as well as atomic increment operator, decrement operator, and compound assignment...

...but that's not really what I'm talking about.

I want to know which types have naturally atomic reads and naturally atomic writes only. For what I am talking about, increment, decrement, and compound assignment will not be naturally atomic.

CodePudding user response：

The answer from the point of view of the language standard is very simple: none of them are "definitively automatically" atomic.

First of all, it's important to distinguish between two senses of "atomic".

One is atomic with respect to signals. This ensures, for instance, that when you do x = 5 on a sig_atomic_t, then a signal handler invoked in the current thread will see either the old or new value. This is usually accomplished simply by doing the access in one instruction, since signals can only be triggered by hardware interrupts, which can only arrive between instructions. For instance, x86 add dword ptr [var], 12345, even without a lock prefix, is atomic in this sense.
The other is atomic with respect to threads, so that another thread accessing the object concurrently will see a correct value. This is more difficult to get right. In particular, ordinary variables of type sig_atomic_t are not atomic with respect to threads. You need _Atomic or std::atomic to get that.
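
To make the first sense concrete, here is a minimal sketch of the one pattern the C standard does bless for plain variables: a volatile sig_atomic_t flag written by a signal handler and read by the code it interrupted. The names got_signal, on_signal and signal_flag_demo are made up for illustration. It says nothing about a second thread, which is exactly the limitation described above.

```c
#include <assert.h>
#include <signal.h>

/* Written by the handler, read by the interrupted code in the SAME thread.
   volatile sig_atomic_t is the type the C standard guarantees for this. */
static volatile sig_atomic_t got_signal = 0;

static void on_signal(int signo) {
    (void)signo;
    got_signal = 1; /* plain store: atomic with respect to signals only */
}

/* Hypothetical helper: installs the handler, raises the signal in the
   calling thread, and returns the value of the flag afterwards. */
int signal_flag_demo(void) {
    got_signal = 0;
    void (*previous)(int) = signal(SIGINT, on_signal);
    assert(previous != SIG_ERR);
    raise(SIGINT);            /* the handler runs before raise() returns */
    signal(SIGINT, previous); /* restore the earlier disposition */
    return got_signal;
}
```

Calling signal_flag_demo() should return 1. If another thread were to read got_signal concurrently, that would be a data race, and the flag would have to be an _Atomic object instead.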

Note well that the internal names your implementation chooses for its types are not evidence of anything. From typedef int _Atomic_word; I would certainly not infer that "int is clearly atomic"; I don't know in what sense the implementers were using the word "atomic", or whether it's accurate (could be used by legacy code, for instance). If they wanted to make such a promise it would be in the documentation, not in an unexplained typedef in a bits header that is never meant to be seen by the application programmer.

The fact that your hardware may make certain types of access "automatically atomic" does not tell you anything at the level of C/C++. For instance, it is true on x86 that ordinary full-size loads and stores to naturally aligned variables are atomic. But in the absence of std::atomic, the compiler is under no obligation to emit ordinary full-size loads and stores; it is entitled to be clever and access those variables in other ways. It "knows" this will be no problem, because concurrent access would be a data race, and of course the programmer would never write code with a data race, would they?

As a concrete example, consider the following code:

unsigned x;

unsigned foo(void) {
    return (x >> 8) & 0xffff;
}

A load of a nice 32-bit integer variable, followed by some arithmetic. What could be more innocent? Yet check out the assembly emitted by GCC 11.2 -O2 (try on godbolt):

foo:
        movzx   eax, WORD PTR x[rip+1]
        ret

Oh dear. A partial load, and unaligned to boot. AFAIK x86 provides no atomicity promises about unaligned loads.
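
For comparison, here is a sketch of the same function with x declared _Atomic and read through an explicit relaxed load. The language then requires the read to behave as one indivisible access to the whole object, and the shift and mask work on a private copy. I have not checked every compiler, but I would expect GCC and Clang to emit an ordinary full-width mov here on x86-64. The helpers set_x and foo_demo are not part of the original example; they are added only so the sketch can be exercised.

```c
#include <assert.h>
#include <stdatomic.h>

static _Atomic unsigned x;

unsigned foo(void) {
    /* One atomic load of the whole object; the arithmetic below operates
       on the local copy, so it cannot observe a half-updated value. */
    unsigned snapshot = atomic_load_explicit(&x, memory_order_relaxed);
    return (snapshot >> 8) & 0xffff;
}

/* Hypothetical writer side, e.g. called from another thread. */
void set_x(unsigned value) {
    atomic_store_explicit(&x, value, memory_order_relaxed);
}

/* Small self-check: 0x12345678 >> 8 is 0x123456, masked to 0x3456. */
void foo_demo(void) {
    set_x(0x12345678u);
    assert(foo() == 0x3456u);
}
```

memory_order_relaxed asks for atomicity only, with no ordering relative to other variables, which is the closest match to the "naturally atomic read" the question is after.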

Here is another fun example, this time on ARM64. Aligned 64-bit stores are atomic, per B2.2.1 of the ARMv8-A Architecture Reference Manual. So this looks fine:

unsigned long x;

void bar(void) {
    x = 0xdeadbeefdeadbeef;
}

But, GCC 11.2 -O2 gives (godbolt):

bar:
        adrp    x1, .LANCHOR0
        add     x2, x1, :lo12:.LANCHOR0
        mov     w0, 48879
        movk    w0, 0xdead, lsl 16
        str     w0, [x1, #:lo12:.LANCHOR0]
        str     w0, [x2, 4]
        ret

That's two 32-bit strs, not atomic in any way. A reader may very well read 0x00000000deadbeef.

Why do it this way? Materializing a 64-bit constant in a register takes several instructions on ARM64, with its fixed instruction size. But both halves of the value are equal, so why not materialize the 32-bit value and store it to each half?

(If you do unsigned long *p; *p = 0xdeadbeefdeadbeef then you get stp w1, w1, [x0] (godbolt), which looks more promising as it is a single instruction, but in fact is still two separate writes for purposes of atomicity between threads.)

It really isn't safe to assume anything beyond what the language actually guarantees, which is nothing unless you use std::atomic (or _Atomic in C). Modern compilers know the exact limits of the language rules very well, and optimize aggressively. They can and will break code that assumes they will do what would be "natural", if that is outside the bounds of what the language promises, and they will very often do it in ways that one would never expect.
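
As a rough sketch of what that looks like for the variables in the question, here they are declared _Atomic instead of volatile and accessed with relaxed loads and stores. The functions writer_step, reader_step and relaxed_demo, and struct snapshot, are names I made up: writer_step stands for one pass of the writer thread's loop, reader_step for one pass of the reader's. Only a few of the question's variables are shown; the others follow the same pattern.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Same idea as the question's globals, but _Atomic rather than volatile. */
static _Atomic bool     shared_bool;
static _Atomic uint32_t shared_u32;
static _Atomic uint64_t shared_u64;
static _Atomic double   shared_d;

/* One iteration of the writer thread: each store is individually atomic. */
void writer_step(void) {
    atomic_store_explicit(&shared_bool, true, memory_order_relaxed);
    atomic_store_explicit(&shared_u32, 130890u, memory_order_relaxed);
    atomic_store_explicit(&shared_u64, UINT64_C(0xdeadbeefdeadbeef),
                          memory_order_relaxed);
    atomic_store_explicit(&shared_d, 382.10830, memory_order_relaxed);
}

struct snapshot { bool b; uint32_t u32; uint64_t u64; double d; };

/* One iteration of the reader thread. Each field is read without tearing,
   but the four fields together are NOT one consistent snapshot. */
struct snapshot reader_step(void) {
    struct snapshot s;
    s.b   = atomic_load_explicit(&shared_bool, memory_order_relaxed);
    s.u32 = atomic_load_explicit(&shared_u32, memory_order_relaxed);
    s.u64 = atomic_load_explicit(&shared_u64, memory_order_relaxed);
    s.d   = atomic_load_explicit(&shared_d, memory_order_relaxed);
    return s;
}

/* Single-threaded self-check of the round trip. */
void relaxed_demo(void) {
    writer_step();
    struct snapshot s = reader_step();
    assert(s.b && s.u32 == 130890u);
    assert(s.u64 == UINT64_C(0xdeadbeefdeadbeef));
    assert(s.d == 382.10830);
}
```

With relaxed ordering the compiler is still free to reorder accesses to different variables, so this buys tear-free reads and writes and nothing more. If the reader must see several values that belong together, a mutex or a stronger memory order is needed.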

CodePudding user response：

On 8-bit AVR microcontrollers (ex: the ATmega328 mcu, used by the Arduino Uno or Mini), only 8-bit data types have atomic reads and writes.

Only in case you write your code in assembler, not in C.

On (32-bit) STM32 microcontrollers, any data type 32-bits or smaller is definitively automatically atomic.

Only in case you write your code in assembler, not in C. Additionally, only if the ISA guarantees that the generated instruction is atomic; I don't remember if this is true for all ARM instructions.

That includes bool/_Bool, int8_t/uint8_t, int16_t/uint16_t, int32_t/uint32_t, float, and all pointers.

No, that is definitely wrong.

Now I need to know for my 64-bit Linux computer. Which types are definitively automatically atomic?

The same types as in AVR and STM32: none.

This all boils down to the fact that a variable access in C cannot be guaranteed to be atomic, because it might get carried out in multiple instructions, or in some cases in instructions for which the ISA doesn't guarantee atomicity.

The only types that can be regarded as atomic in C (and C++) are those with the _Atomic qualifier from C11 (or std::atomic<> from C++11). Period.
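
What you can ask the implementation is which _Atomic types it implements without locks on your target. Below is a small sketch using the standard ATOMIC_*_LOCK_FREE macros from <stdatomic.h>; the functions describe and report_lock_free are hypothetical names of my own. Note that "lock-free" describes the _Atomic version of the type. It is not a promise about plain or volatile variables of the same size.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdio.h>

/* The standard defines these macros as 0, 1 or 2. */
static const char *describe(int level) {
    switch (level) {
    case 0:  return "never lock-free";
    case 1:  return "sometimes lock-free";
    case 2:  return "always lock-free";
    default: return "unknown";
    }
}

/* Prints one line per atomic type family and returns how many of them
   are reported as always lock-free on this target. */
int report_lock_free(void) {
    const struct { const char *name; int level; } rows[] = {
        { "atomic_bool",  ATOMIC_BOOL_LOCK_FREE },
        { "atomic_char",  ATOMIC_CHAR_LOCK_FREE },
        { "atomic_short", ATOMIC_SHORT_LOCK_FREE },
        { "atomic_int",   ATOMIC_INT_LOCK_FREE },
        { "atomic_long",  ATOMIC_LONG_LOCK_FREE },
        { "atomic_llong", ATOMIC_LLONG_LOCK_FREE },
        { "pointers",     ATOMIC_POINTER_LOCK_FREE },
    };
    int always = 0;
    for (size_t i = 0; i < sizeof rows / sizeof rows[0]; ++i) {
        assert(rows[i].level >= 0 && rows[i].level <= 2);
        printf("%-12s : %s\n", rows[i].name, describe(rows[i].level));
        if (rows[i].level == 2)
            ++always;
    }
    return always;
}
```

The output depends on the target, so run it on the machine you care about rather than trusting a table from somewhere else.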

This answer of mine at EE here is a duplicate. It addresses the microcontroller cases explicitly, race conditions, use of volatile, dangerous optimizations etc. It also contains a simple way to protect from race conditions in interrupts which is applicable to all MCUs where interrupts cannot be interrupted. A quote from that answer:

When writing C, all communication between an ISR and the background program must be protected against race conditions. Always, every time, no exceptions. The size of the MCU data bus does not matter, because even if you do a single 8 bit copy in C, the language cannot guarantee atomicity of operations. Not unless you use the C11 feature _Atomic. If this feature isn't available, you must use some manner of semaphore or disable the interrupt during read etc. Inline assembler is another option. volatile does not guarantee atomicity.
