Why do I get different results for complex multiplication involving NaN with gcc depending on optimisation?

Consider the following code:

#include <math.h>
#include <complex.h>
#include <stdio.h>

int main()
{
    complex double A = CMPLX(-NAN, 0.0);
    complex double B = CMPLX(NAN, 0.0);

    printf("A = %e %e, B = %e %e\n", creal(A), cimag(A), creal(B), cimag(B));

    B = 0.5 * A;

    printf("A = %e %e, B = %e %e\n", creal(A), cimag(A), creal(B), cimag(B));
}

On various amd64 Debian systems with gcc 8.5.0, 10.2.1 and 12, this code produces the following output:

output with no optimisation

# cc -o test test.c -lm && ./test
A = -nan 0.000000e+00, B = nan 0.000000e+00
A = -nan 0.000000e+00, B = nan 0.000000e+00

output with -O1

# cc -O1 -o test test.c -lm && ./test
A = -nan 0.000000e+00, B = nan 0.000000e+00
A = -nan 0.000000e+00, B = -nan 0.000000e+00

I do not understand why the output differs. On FreeBSD, I was unable to ever get the “-O1” output. Instead, all versions of gcc I tested with all compiler flags always produced the “no optimisation” output.

Why does the output differ depending on optimisation flags? Is this a compiler bug or perhaps undefined behaviour? What can I do to avoid this problem?

User response:

IEEE 754 does not interpret the sign of a NaN and, apart from a few sign-bit operations (copy, negate, abs, copySign), does not specify the sign bit of the result of any computation involving a NaN, such as 0.5 * -NAN. The only requirement C places on the macro NAN is that it expand to a constant expression of type float representing a quiet NaN. As far as I can see, nothing even guarantees that the precise NaN bit pattern produced be the same for every use of NAN.

According to IEEE-754, operations which propagate a NaN (like 0.5 * -NAN) are required to propagate the NaN's payload (which comprises the mantissa bits after the first one). But they are not required to propagate the sign bit; the operation may or may not follow the usual convention for sign bits.
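To see which part of a NaN actually changed, it can help to look at the raw bits rather than at printf's "nan" / "-nan". The following is only a sketch: it assumes double is the IEEE 754 binary64 format with the usual layout (sign in bit 63, quiet bit in bit 51), and all helper names (double_bits, double_from_bits, sign_bit, nan_payload, is_negative_nan) are made up for this illustration.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Assumption: double is IEEE 754 binary64, so it is exactly 64 bits wide. */
static_assert(sizeof(double) == sizeof(uint64_t), "double must be 64 bits");

/* Raw bit pattern of x; memcpy is the well-defined way to reinterpret it. */
static uint64_t double_bits(double x)
{
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);
    return bits;
}

/* Inverse of double_bits: build a double from a raw bit pattern. */
static double double_from_bits(uint64_t bits)
{
    double x;
    memcpy(&x, &bits, sizeof x);
    return x;
}

/* 1 if the sign bit is set, 0 otherwise. Works for NaNs as well. */
static int sign_bit(double x)
{
    return (int)(double_bits(x) >> 63);
}

/* The 51 mantissa bits below the quiet bit, i.e. the payload of a NaN. */
static uint64_t nan_payload(double x)
{
    return double_bits(x) & ((UINT64_C(1) << 51) - 1);
}

/* 1 if x is a NaN whose sign bit is set (what printf shows as "-nan"). */
static int is_negative_nan(double x)
{
    return isnan(x) && sign_bit(x);
}
```

Calling sign_bit() and nan_payload() on creal(A) and creal(B) from the question, before and after the multiplication, should show whether only the sign bit differs between the two builds or the payload as well.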

In general, not only for NaNs, if there is more than one valid result from a computation, the C standard does not oblige compilers to choose the same valid result for every computation. There are cases where an intermediate result can legitimately be computed with precision beyond that of the datatype used, which can change the rounded value of the final result. Similarly, if there are two or more different NaN bit patterns which can result from an operation, the compiler is not obliged to arrange to be consistent. For example, the sum of two NaNs is a NaN, and the propagated payload requirement of IEEE-754 requires that the payload of the result be one of the payloads of the inputs. But it doesn't specify which one, so if the compiler happens to sometimes compute A + B and other times B + A, and those propagate different payloads, that's valid.
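One practical consequence, as far as avoiding the problem goes, is to never depend on the sign or payload of a NaN when comparing results. A minimal sketch of such a comparison follows; the helper name same_ignoring_nan_bits is hypothetical, and it assumes the program is not built with options such as -ffast-math that allow the compiler to assume NaNs never occur.

```c
#include <math.h>
#include <stdbool.h>

/*
 * Hypothetical helper: treat every NaN as equivalent to every other NaN,
 * whatever its sign bit or payload; otherwise compare by value.
 * Note that 0.0 and -0.0 compare equal here, as they do with ==.
 */
static bool same_ignoring_nan_bits(double a, double b)
{
    if (isnan(a) || isnan(b))
        return isnan(a) && isnan(b);
    return a == b;
}
```

With a check like this, the "nan" and "-nan" results from the question count as the same outcome, so a test comparing them would pass at any optimisation level.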

In particular, the compiler is free to fold constant computations at compile-time, and compile-time computations are not required to be bitwise identical with the actual computation which might have otherwise been emitted had the compiler not folded the constant. (Of course, they must be correct. But sometimes there is more than one correct bit-pattern.) I'm pretty sure that's what is happening here: GCC folds constants more aggressively at -O1 than at -O0. (Contrast that with Clang.)
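One way to probe the folding hypothesis is to compute the same product twice: once from constants the compiler is free to fold, and once through a volatile object, which forces the operand to be loaded when the program runs. This is only a sketch with made-up function names; whether the first function is really folded depends on the compiler and the flags, and the standard leaves the sign of either result open.

```c
#include <math.h>

/* Both operands are constants, so the compiler may compute this at compile time. */
static double half_of_neg_nan_foldable(void)
{
    return 0.5 * -NAN;
}

/*
 * The volatile qualifier forces x to be read at run time, so the
 * multiplication cannot be replaced by a precomputed constant.
 */
static double half_of_neg_nan_runtime(void)
{
    volatile double x = -NAN;
    return 0.5 * x;
}
```

Printing both results with "%e" under -O0 and -O1 shows whether the two paths agree on a given system. Both results are NaNs in every case; only the sign bit, which is unspecified, may differ.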