What's making these mathematically equivalent floating point equations give different answers?

So I know computers aren't especially 'good' at handling floats in many languages. I've seen 0.1 + 0.2 fail quite a few times in different languages. I've also learned that compilers can optimize a bit by evaluating some expressions during compilation. Knowing all that, I ran a little experiment.

Here's some C code I wrote:

#include <stdio.h>

float g1 = 0.1;
float g2 = 0.2;

int main() {
  float l1 = 0.1;
  float l2 = 0.2;
  printf("a: %.50f\n", 0.3);
  printf("b: %.50f\n", 0.1 + 0.2);
  printf("c: %.50f\n", (0.1 * 10 + 0.2 * 10) / 10);
  printf("d: %.50f\n", l1 + l2);
  printf("e: %.50f\n", (l1 * 10 + l2 * 10) / 10);
  printf("f: %.50f\n", g1 + g2);
  printf("g: %.50f\n", (g1 * 10 + g2 * 10) / 10);
  return 0;
}

Here's its output:

a: 0.29999999999999998889776975374843459576368331909180
b: 0.30000000000000004440892098500626161694526672363281
c: 0.29999999999999998889776975374843459576368331909180
d: 0.30000001192092895507812500000000000000000000000000
e: 0.30000001192092895507812500000000000000000000000000
f: 0.30000001192092895507812500000000000000000000000000
g: 0.30000001192092895507812500000000000000000000000000

It makes complete sense to me that "d," "e," "f" and "g" have the same result. I think "a," "b" and "c" being different from "d," "e," "f" and "g" is because of the difference between compile-time and run-time evaluation. However, I find it strange that "a" and "c" are the same but "b" is different.

Are my current understandings correct? Why are "a" and "c" the same while "b" is different?

CodePudding user response：

The literal (constant) values specified as arguments to printf in your "a", "b" and "c" cases are of type double, not float. Adding the f suffix, to make them float, and then running it after compiling with clang-cl gives the same answer for all cases:

#include <stdio.h>

float g1 = 0.1;
float g2 = 0.2;

int main() {
    float l1 = 0.1;
    float l2 = 0.2;
    printf("a: %.50f\n", 0.3f);
    printf("b: %.50f\n", 0.1f + 0.2f);
    printf("c: %.50f\n", (0.1f * 10 + 0.2f * 10) / 10);
    printf("d: %.50f\n", l1 + l2);
    printf("e: %.50f\n", (l1 * 10 + l2 * 10) / 10);
    printf("f: %.50f\n", g1 + g2);
    printf("g: %.50f\n", (g1 * 10 + g2 * 10) / 10);
    return 0;
}

Output:

a: 0.30000001192092895507812500000000000000000000000000
b: 0.30000001192092895507812500000000000000000000000000
c: 0.30000001192092895507812500000000000000000000000000
d: 0.30000001192092895507812500000000000000000000000000
e: 0.30000001192092895507812500000000000000000000000000
f: 0.30000001192092895507812500000000000000000000000000
g: 0.30000001192092895507812500000000000000000000000000

(With your original code, the output I get is as you have quoted.)

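Now, although float arguments to functions with variadic arguments (like printf) are promoted to double, that promotion only happens after the expression has been evaluated, and the precision the compiler uses for intermediate results is implementation-specific (C reports it through FLT_EVAL_METHOD in <float.h>). Adding the explicit f suffix makes every constant a float, which will (should?) remove any ambiguity.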

CodePudding user response：

Most exact decimal fractions cannot be represented exactly as a binary floating point number with limited precision.

For IEEE 754 double precision format (C double), the closest representations of 0.1 and 0.2 are both slightly above the exact decimal value, so their sum is also slightly above the exact decimal sum. However, the closest representable value of 0.3 is slightly below the exact decimal value.

If the C compiler folds constant expressions, it should take the rounding rules into account. It should not replace the decimal expression 0.1 + 0.2 with the decimal constant 0.3 before converting to the binary representation. It should convert 0.1 and 0.2 to their binary representations and sum them to a single binary representation.

The example below illustrates that the output is the same whether the values come from constants or from variables assigned with the same constants (and of the same type as the constants, double in this case):

#include <stdio.h>

int main(void) {
    double d1 = 0.1;
    double d2 = 0.2;
    double d3 = 0.3;

    printf("0.1: %.50f\n", 0.1);
    printf("0.2: %.50f\n", 0.2);
    printf("0.1+0.2: %.50f\n", 0.1 + 0.2);
    printf("0.3: %.50f\n", 0.3);
    printf("\n");
    printf("d1: %.50f\n", d1);
    printf("d2: %.50f\n", d2);
    printf("d1+d2: %.50f\n", d1 + d2);
    printf("d3: %.50f\n", d3);
    return 0;
}

Output:

0.1: 0.10000000000000000555111512312578270211815834045410
0.2: 0.20000000000000001110223024625156540423631668090820
0.1+0.2: 0.30000000000000004440892098500626161694526672363281
0.3: 0.29999999999999998889776975374843459576368331909180

d1: 0.10000000000000000555111512312578270211815834045410
d2: 0.20000000000000001110223024625156540423631668090820
d1+d2: 0.30000000000000004440892098500626161694526672363281
d3: 0.29999999999999998889776975374843459576368331909180

Consistency is a good thing!

For the IEEE 754 single precision format (C float), the closest representations of 0.1, 0.2, and 0.3 are all slightly above the exact decimal values. As long as we are careful to use the f suffix on the constants to make them type float instead of double, we still get consistent results for expressions using constants and expressions using variables assigned with those constants, as shown below:

#include <stdio.h>

int main(void) {
    float f1 = 0.1f;
    float f2 = 0.2f;
    float f3 = 0.3f;

    printf("0.1f: %.50f\n", 0.1f);
    printf("0.2f: %.50f\n", 0.2f);
    printf("0.1f+0.2f: %.50f\n", 0.1f + 0.2f);
    printf("0.3f: %.50f\n", 0.3f);
    printf("\n");
    printf("f1: %.50f\n", f1);
    printf("f2: %.50f\n", f2);
    printf("f1+f2: %.50f\n", f1 + f2);
    printf("f3: %.50f\n", f3);
    return 0;
}

Output:

0.1f: 0.10000000149011611938476562500000000000000000000000
0.2f: 0.20000000298023223876953125000000000000000000000000
0.1f+0.2f: 0.30000001192092895507812500000000000000000000000000
0.3f: 0.30000001192092895507812500000000000000000000000000

f1: 0.10000000149011611938476562500000000000000000000000
f2: 0.20000000298023223876953125000000000000000000000000
f1+f2: 0.30000001192092895507812500000000000000000000000000
f3: 0.30000001192092895507812500000000000000000000000000