Is it legal for the compiler to assume that a static variable will not be modified by another thread?

I'm a bit surprised that the compiler (gcc) is simply assuming that a static variable will never be touched by other threads even with the lowest optimization level. I was trying to read a value written from another thread, but gcc simply thinks the value has never changed. Is reading a value of a static variable modified by another thread undefined behaviour per the standard?

I'm specifically asking about the assumption the compiler is making, not about what happens when a program does not handle thread synchronization correctly.

Think about a case where thread B is waiting for a signal from thread A, and thread A sends the signal after making modifications to some global static variables and then goes to sleep.

CodePudding user response：

According to §5.1.2.4 ¶25 and ¶4 of the ISO C11 standard, two different threads accessing the same memory location using non-atomic operations in an unordered fashion causes undefined behavior, if at least one thread is writing to that memory location.

Therefore, it is legal for a compiler to assume that no other thread will change a non-atomic non-volatile variable, unless the threads are synchronized in some way.

If thread synchronization is used (for example a mutex), then the compiler is no longer allowed to assume that a variable has not been modified by another thread, unless a memory order was used that allows the compiler to continue to make this assumption.
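
To illustrate the remark about memory orders, here is a minimal sketch, assuming C11 <stdatomic.h> and <threads.h> are available on the platform. The names payload, published, publisher and consume_payload are hypothetical and do not come from the question. A release store paired with an acquire load orders the plain write to payload before the later read, so once the acquire load observes 1 the compiler may no longer assume that payload is unchanged.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <threads.h>

static int payload;            /* plain variable: non-atomic, non-volatile */
static atomic_int published;   /* 0 = not yet written, 1 = payload is valid */

/* Writer thread: fill in the payload, then publish it with a release store. */
static int publisher(void *arg)
{
    (void)arg;
    payload = 7;
    atomic_store_explicit(&published, 1, memory_order_release);
    return 0;
}

/* Reader (the caller): spin until the flag is seen with an acquire load,
   then read the plain variable. Returns the value seen, or -1 if the
   thread could not be created. */
int consume_payload(void)
{
    thrd_t t;
    int seen;

    atomic_store(&published, 0);
    if (thrd_create(&t, publisher, NULL) != thrd_success)
        return -1;

    while (atomic_load_explicit(&published, memory_order_acquire) == 0)
        thrd_yield();          /* let the writer run */

    seen = payload;            /* ordered after the writer's store to payload */
    thrd_join(t, NULL);
    return seen;
}
```

If both sides used memory_order_relaxed instead, the flag itself would still be safe to access, but no ordering would be established for payload, and reading it would again be a data race.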

In your question, you state that you are attempting to order threads using "signals". However, in ISO C, "signals" cannot be used for thread synchronization. According to §7.14.1.1 ¶7 of the ISO C11 standard, using the function signal in a multithreaded program results in undefined behavior.

If you instead mean signalling a condition variable using the function cnd_signal, then yes, condition variables (which also use mutexes) can be used for proper thread synchronization.
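
The scenario from the question (thread A modifies a static variable, then signals thread B) could be sketched with a condition variable as follows. This is only an illustration, assuming the optional C11 <threads.h> header is available; shared_value, ready, thread_a and wait_for_value are hypothetical names.

```c
#include <stdbool.h>
#include <stddef.h>
#include <threads.h>

static int shared_value;   /* plain static: non-atomic, non-volatile */
static bool ready;         /* predicate, only touched with lock held */
static mtx_t lock;
static cnd_t cond;

/* Thread A: modify the static variable, then signal. */
static int thread_a(void *arg)
{
    (void)arg;
    mtx_lock(&lock);
    shared_value = 42;
    ready = true;
    mtx_unlock(&lock);
    cnd_signal(&cond);
    return 0;
}

/* Thread B (the caller): wait for the signal, then read the variable.
   Returns the value seen, or -1 if setup fails. */
int wait_for_value(void)
{
    thrd_t a;
    int seen;

    if (mtx_init(&lock, mtx_plain) != thrd_success)
        return -1;
    if (cnd_init(&cond) != thrd_success) {
        mtx_destroy(&lock);
        return -1;
    }
    ready = false;
    if (thrd_create(&a, thread_a, NULL) != thrd_success) {
        cnd_destroy(&cond);
        mtx_destroy(&lock);
        return -1;
    }

    mtx_lock(&lock);
    while (!ready)                 /* re-check: wakeups may be spurious */
        cnd_wait(&cond, &lock);
    seen = shared_value;           /* read while holding the mutex */
    mtx_unlock(&lock);

    thrd_join(a, NULL);
    cnd_destroy(&cond);
    mtx_destroy(&lock);
    return seen;
}
```

Because both the write and the read of shared_value happen with the mutex held, the accesses are ordered and the compiler cannot treat the variable as unchanged across cnd_wait. The ready flag also covers the case where thread A signals before thread B starts waiting.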

If you are instead referring to platform-specific functionality, then I cannot comment on that, as you did not specify a platform in your question.

CodePudding user response：

Note for those who downvote without reading: this answer is not about IPC. It only answers the first question asked. IPC is too broad and complicated for a short SO answer. I do not write about race conditions, atomicity or coherency.

> I'm a bit surprised that the compiler (gcc) is simply assuming that a static variable will never be touched by other threads even with the lowest optimization level.

§5.1.2.4 ¶4 of the standard reads: "Two expression evaluations conflict if one of them modifies a memory location and the other one reads or modifies the same memory location."

You ask two distinct questions. The first one is about side effects. The second about IPC mechanisms.

I will answer only the first one as the second is too broad to be answered here on SO.

The compiler is assuming that objects (variables) can be changed only if the code changing them is in the normal program execution path.

If not, it assumes that those objects will not be changed.

But C has a special keyword, volatile. It informs the compiler that a volatile object is prone to side effects, i.e. it can be changed by something outside the normal program execution path. The compiler will generate a read from the object's storage location every time it is used, and a write to the object's storage location on every modification.

Example:

unsigned counter1;
volatile unsigned counter2;

int interruptHandler1(void)
{
    counter1++;
}

void foo(void)
{
    while(1)
        if(counter1 > 100) printf("Larger!!!!");
}

int interruptHandler2(void)
{
    counter2++;
}

void bar(void)
{
    while(1)
        if(counter2 > 100) printf("Larger!!!!");
}

Output code:

interruptHandler1:
        add     DWORD PTR counter1[rip], 1
        ret
.LC0:
        .string "Larger!!!!"
foo:
        cmp     DWORD PTR counter1[rip], 100
        ja      .L12
.L11:
        jmp     .L11
.L12:
        push    rax
.L4:
        xor     eax, eax
        mov     edi, OFFSET FLAT:.LC0
        call    printf
        cmp     DWORD PTR counter1[rip], 100
        ja      .L4
.L8:
        jmp     .L8
interruptHandler2:
        mov     eax, DWORD PTR counter2[rip]
        add     eax, 1
        mov     DWORD PTR counter2[rip], eax
        ret
bar:
.L20:
        mov     eax, DWORD PTR counter2[rip]
        cmp     eax, 100
        jbe     .L20
        sub     rsp, 8
.L19:
        mov     edi, OFFSET FLAT:.LC0
        xor     eax, eax
        call    printf
.L15:
        mov     eax, DWORD PTR counter2[rip]
        cmp     eax, 100
        jbe     .L15
        jmp     .L19
counter2:
        .zero   4
counter1:
        .zero   4

A volatile object is read from its storage location on every access:

int foo1(void)
{
    return counter1 + counter1 + counter1 + counter1;
}

int bar1(void)
{
    return counter2 + counter2 + counter2 + counter2;
}

foo1:
        mov     eax, DWORD PTR counter1[rip]
        sal     eax, 2
        ret
bar1:
        mov     eax, DWORD PTR counter2[rip]
        mov     esi, DWORD PTR counter2[rip]
        mov     ecx, DWORD PTR counter2[rip]
        mov     edx, DWORD PTR counter2[rip]
        add     eax, esi
        add     eax, ecx
        add     eax, edx
        ret

And it is written back to its storage location on every modification:

void foo2(void)
{
    counter1++;
    counter1++;
    counter1++;
    counter1++;
}

void bar2(void)
{
    counter2++;
    counter2++;
    counter2++;
    counter2++;
}

foo2:
        add     DWORD PTR counter1[rip], 4
        ret
bar2:
        mov     eax, DWORD PTR counter2[rip]
        add     eax, 1
        mov     DWORD PTR counter2[rip], eax
        mov     eax, DWORD PTR counter2[rip]
        add     eax, 1
        mov     DWORD PTR counter2[rip], eax
        mov     eax, DWORD PTR counter2[rip]
        add     eax, 1
        mov     DWORD PTR counter2[rip], eax
        mov     eax, DWORD PTR counter2[rip]
        add     eax, 1
        mov     DWORD PTR counter2[rip], eax
        ret

CodePudding user response：

> That only applies to data race conditions when "neither happens before the other". What if the value is read clearly after a modification from another thread?

"Happens before" is somewhat of a tricky concept. If the language standard says, "A happens before B," it does not mean that A always is guaranteed to happen before B in real time. Its meaning only becomes clear when we understand it as a transitive relationship: If, according to the standard, A "happens before" B, and B "happens before" C; then we can infer that A "happens before" C.

But, does A actually happen before C in real-time?

Let's imagine two threads. One of them updates a shared variable that is protected by a mutex:

void writer(...) {
    mytype_t new_value = create_new_value(...);

    pthread_mutex_lock(&mutex);
    global_var = new_value;
    pthread_mutex_unlock(&mutex);
}

The other thread accesses the same variable:

void reader(...) {
    mytype_t local_copy;

    pthread_mutex_lock(&mutex);
    local_copy = global_var;
    pthread_mutex_unlock(&mutex);

    do_something_with(local_copy);
}

One "happens before" rule, alluded to in a comment by user17732522, is that within any single thread, everything "happens" in program order. That is to say, because global_var = new_value; appears in the source code of the writer(...) function before pthread_mutex_unlock(&mutex); then the assignment must "happen before" the unlock within any one call to writer(...).

Another rule says that unlocking a mutex in one thread "happens before" the next lock of that same mutex by any other thread.

From these rules, we can infer that *IF* some thread A calls writer(...) and locks the mutex before some other thread B enters reader(...), then when thread B eventually acquires the mutex and reads the global_var, it will read the value that thread A wrote.

But that's a big "*IF*!" Nothing that I have shown in this example actually guarantees that thread A actually will call writer() before thread B calls reader(). You would have to add some higher-level inter-thread communication if you wanted to ensure that the threads actually did call those functions in any particular real-time order.
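
The "big IF" can be made concrete with a sketch that completes the two functions into something that compiles. It assumes POSIX threads and uses a plain int in place of mytype_t; race_for_value is a hypothetical helper, and the reader hands its copy back through a pointer instead of calling do_something_with.

```c
#include <pthread.h>
#include <stddef.h>

static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
static int global_var = 0;     /* shared, protected by mutex */

static void *writer(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&mutex);
    global_var = 1;
    pthread_mutex_unlock(&mutex);
    return NULL;
}

static void *reader(void *arg)
{
    int *local_copy = arg;     /* where to store what was read */

    pthread_mutex_lock(&mutex);
    *local_copy = global_var;
    pthread_mutex_unlock(&mutex);
    return NULL;
}

/* Start both threads with no ordering between them.
   Returns what the reader saw (0 or 1), or -1 if a thread
   could not be created. */
int race_for_value(void)
{
    pthread_t w, r;
    int seen = -1;

    global_var = 0;
    if (pthread_create(&w, NULL, writer, NULL) != 0)
        return -1;
    if (pthread_create(&r, NULL, reader, &seen) != 0) {
        pthread_join(w, NULL);
        return -1;
    }
    pthread_join(w, NULL);
    pthread_join(r, NULL);
    return seen;
}
```

The program is free of data races, yet race_for_value() may return either 0 or 1 depending on which thread acquires the mutex first. The mutex only guarantees that the reader sees a consistent value; making the reader see the new value requires an additional mechanism such as a condition variable.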