While writing an answer regarding how compilers must treat volatile, I believe I may have stumbled upon a gcc bug and would like someone to verify before I report it.
I wrote up a simple function such as this:
int foo (int a, int b, int c)
{
    b = a + 1;
    c = b + 1;
    a = c + 1;
    return a;
}
Without optimizations this results in a lot of pointless moving of data back and forth. With optimizations the compiler just grabs the register where a was stored, then adds 3 and returns that result. To speak x86: lea eax, [rdi+3] and ret. This is expected, so far so good.
To demonstrate sequencing and volatile access, I changed the example to this:
int foo (int a, int b, int c)
{
    b = a + 1;
    c = *(volatile int*)&b + 1;
    a = c + 1;
    return a;
}
Here there's an lvalue access of the contents of b that is volatile qualified and, as far as I can tell, the compiler is absolutely not allowed to optimize away that access 1). From gcc 4.1.2 (and probably earlier) to gcc 10.3 I get conforming behavior (same in clang). The x86 machine code looks like this even with -O3:
foo:
add edi, 1
mov DWORD PTR [rsp-4], edi
mov eax, DWORD PTR [rsp-4]
add eax, 2
ret
Then I try the same on gcc 11.1 and beyond, and now I get:
foo:
lea eax, [rdi+3]
ret
https://godbolt.org/z/e5x74z3Kb
ARM gcc 11.1 does something similar.
Is this a compiler bug?
1) References: ISO/IEC 9899:2018 5.1.2.3, particularly §2, §4 and §6.
CodePudding user response:
Passing the address to a non-inline function makes GCC respect volatile casts for later accesses (and maybe earlier, didn't check) to a function arg or local. https://godbolt.org/z/cssveev7n
I duplicated the c = line and the asm contains two loads of b thanks to the volatile cast, using GCC trunk.
void bar(void*);
int foo (int a, int b, int c)
{
    bar(&b); // b's address has now "escaped" - potentially globally visible
    b = a + 1;
    c = *(volatile int*)&b + 1;
    c = *(volatile int*)&b + 1; // both accesses present.
    a = c + 1;
    return a;
}
# GCC trunk -O3 -fverbose-asm
call bar #
mov DWORD PTR [rsp+12], ebx # b, tmp89
mov eax, DWORD PTR [rsp+12] # _2, MEM[(volatile int *)&b]
mov eax, DWORD PTR [rsp+12] # _3, MEM[(volatile int *)&b]
...
add eax, 2
ret
So this seems innocent except maybe in some microbenchmark use-cases; it's not going to break hand-rolled atomics using casts like these, such as the Linux kernel's READ_ONCE / WRITE_ONCE macros.
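For readers unfamiliar with that idiom, here is a minimal sketch of what cast-based "once" accessors can look like. The names MY_READ_ONCE, MY_WRITE_ONCE, shared_flag, poll_flag and set_flag are hypothetical and chosen for illustration; the real kernel macros are more elaborate, and __typeof__ is a GNU extension rather than ISO C.

```c
/* Hypothetical, simplified stand-ins for cast-based accessors.
   These are NOT the Linux kernel's definitions. */
#define MY_READ_ONCE(x)       (*(const volatile __typeof__(x) *)&(x))
#define MY_WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

/* A plain (non-volatile) object with static storage duration. */
static int shared_flag;

/* Reads shared_flag through a volatile-qualified lvalue. */
int poll_flag(void)
{
    return MY_READ_ONCE(shared_flag);
}

/* Writes shared_flag through a volatile-qualified lvalue. */
void set_flag(int v)
{
    MY_WRITE_ONCE(shared_flag, v);
}
```

Calling set_flag(7) and then poll_flag() returns 7. The typical use is on objects shared with other threads or interrupt handlers rather than on a purely local variable like b in the question, which is presumably why the observed change does not affect it.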
It is still arguably a violation of ISO C rules, if it's legal to alias a plain int with a volatile int. If not, it's only GCC defining behaviour, so it's up to GCC. I post this more as a data point than an argument in either direction on that aspect of the question.
CodePudding user response:
Per C18 5.1.2.3/6, accesses to volatile objects (strictly according to the rules of the abstract machine) are part of the observable behavior of the program, which all conforming implementations must reproduce. The term "access" in this context includes both reads and writes.
C18 5.1.2.3/2 and /4 reinforce that volatile accesses are needed side effects, excluded from the rule that implementations are allowed to avoid producing unneeded side effects.
The only out I see for GCC would be an argument that although *(volatile int*)&b is an lvalue with volatile-qualified type, it can prove that the object it designates (b) is not actually a "volatile object", which indeed it is not if you go by its declaration. And that is consistent with GCC 11.2's observed behavior for this version of the function:
int foo (int a, int b, int c)
{
    volatile int bv = a + 1;
    c = bv + 1;
    a = c + 1;
    return a;
}
, which yields the same assembly as older versions of GCC do for the original code (godbolt).
Whether this constitutes a bug in the sense of non-conformance with the language standard is unclear, but certainly GCC is thwarting the apparent intent of the programmer.
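To compare the two forms side by side, a sketch like the following can be compiled at -O3 and the assembly inspected. The function names foo_cast and foo_object are made up for this illustration; the bodies mirror the code discussed above, reduced to one parameter.

```c
/* Variant A: plain int accessed through a volatile-qualified lvalue,
   as in the question. Whether the load survives optimization is the
   behavior under discussion. */
int foo_cast(int a)
{
    int b = a + 1;
    int c = *(volatile int *)&b + 1;
    return c + 1;
}

/* Variant B: the object itself is declared volatile, so it is a
   "volatile object" by declaration. */
int foo_object(int a)
{
    volatile int bv = a + 1;
    int c = bv + 1;
    return c + 1;
}
```

Both functions return a + 3 either way; the difference under discussion is only whether the store and load of the intermediate value appear in the generated code.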
CodePudding user response:
I heard a compiler team argue convincingly (ok, I nearly fell asleep, so I got a rough outline) that outside of an externally scoped, word-sized object, volatile was a meaningless decoration. Further, the compiler provided some sort of traditional behaviour surrounding meaninglessly attributed objects as a convenience to people working with legacy code. This interpretation was based on an absurd reduction of the C standard which is better than correct: it is technically correct, the gold standard of alpha-geeks.