Consider the following setup:
typedef struct
{
    float d;
} InnerStruct;
typedef struct
{
    InnerStruct **c;
} OuterStruct;
float TestFunc(OuterStruct *b)
{
    float a = 0.0f;
    /* Accumulate the d field of each of the eight inner structs. */
    for (int i = 0; i < 8; i++)
        a += b->c[i]->d;
    return a;
}
The for loop in TestFunc exactly replicates one in another function that I'm testing. Both loops are unrolled by gcc (4.9.2) but yield slightly different assembly after doing so.
Assembly for my test loop:      Assembly for the original loop:
lwz r9,-0x725C(r13)             lwz r9,0x4(r3)
lwz r8,0x4(r9)                  lwz r8,0x8(r9)
lwz r10,0x0(r9)                 lwz r10,0x4(r9)
lwz r11,0x8(r9)                 lwz r11,0x0C(r9)
lwz r4,0x4(r8)                  lwz r3,0x4(r8)
lwz r10,0x4(r10)                lwz r10,0x4(r10)
lwz r8,0x4(r11)                 lwz r0,0x4(r11)
lwz r11,0x0C(r9)                lwz r11,0x10(r9)
efsadd r4,r4,r10                efsadd r3,r3,r10
lwz r10,0x10(r9)                lwz r8,0x14(r9)
lwz r7,0x4(r11)                 lwz r10,0x4(r11)
lwz r11,0x14(r9)                lwz r11,0x18(r9)
efsadd r4,r4,r8                 efsadd r3,r3,r0
lwz r8,0x4(r10)                 lwz r0,0x4(r8)
lwz r10,0x4(r11)                lwz r8,0x0(r9)
lwz r11,0x18(r9)                lwz r11,0x4(r11)
efsadd r4,r4,r7                 efsadd r3,r3,r10
lwz r9,0x1C(r9)                 lwz r10,0x1C(r9)
lwz r11,0x4(r11)                lwz r9,0x4(r8)
lwz r9,0x4(r9)                  efsadd r3,r3,r0
efsadd r4,r4,r8                 lwz r0,0x4(r10)
efsadd r4,r4,r10                efsadd r3,r3,r11
efsadd r4,r4,r11                efsadd r3,r3,r9
efsadd r4,r4,r9                 efsadd r3,r3,r0
The issue is that the float values these two instruction sequences return are not exactly the same, and I can't change the original loop. I need to modify the test loop somehow so that it returns the same values. I believe the test's assembly is equivalent to just adding each element one after another. I'm not very familiar with assembly, so I wasn't sure how the above differences translate into C. I know the unrolling is the cause because if I add a print to the loops, they don't unroll and the results match exactly, as expected.
User response:
I presume this is for unit-testing the one function with another.
In general, floating-point results in C or C++ are not guaranteed to be exactly reproducible, and it is not usually considered legitimate to expect them to be.
By contrast, the Java language standard requires reproducible floating-point results. This is a constant source of complaints about Java, with various accusations that making the results reproducible usually makes them less accurate and sometimes makes the code much slower too.
If you are doing your testing in C or C++, then I would suggest this approach:
Calculate the result as best you can, with both high precision and high accuracy. In this case the input data are in 32-bit float, so convert them all to 64-bit float before calculating the expected result.
If the inputs were in double (and you don't have a bigger long double type), then sort the values into order and add them up smallest to largest. This generally loses less accuracy than adding them in an arbitrary order.
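The sorted summation just described could be sketched as follows. This is only an illustration: the names `compare_magnitude` and `sum_smallest_first` are hypothetical, and ordering by absolute value (rather than by signed value) is an assumption about what "smallest to largest" means here.

```c
#include <stdlib.h>

/* qsort comparator: orders doubles by increasing magnitude.
   ASSUMPTION: "smallest" is taken to mean smallest absolute value. */
static int compare_magnitude(const void *lhs, const void *rhs)
{
    double x = *(const double *)lhs;
    double y = *(const double *)rhs;
    if (x < 0.0) x = -x;
    if (y < 0.0) y = -y;
    return (x > y) - (x < y);
}

/* Sorts the array in place by magnitude, then adds the smallest values first
   so that small terms are combined before they meet a much larger one. */
double sum_smallest_first(double *values, size_t count)
{
    qsort(values, count, sizeof values[0], compare_magnitude);

    double total = 0.0;
    for (size_t i = 0; i < count; i++)
        total += values[i];
    return total;
}
```

Note that the function reorders the caller's array; copy the inputs first if the test needs them in their original order.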
Once you have your expected result then test that the function output matches it within some bounds.
There are two approaches to setting what accuracy you require to consider the test as a pass:
One approach is to check what the real physical meaning of the number is and what accuracy you actually require.
The other approach is to just require that the result is accurate to within a few units in the last place of the ideal result, i.e. that the error is less than a few times the ideal result times FLT_EPSILON.
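As a minimal sketch of the second approach, assuming the inputs are available as a plain array of float: the helper names `reference_sum` and `within_tolerance` and the `max_ulps` parameter are hypothetical, not part of any library.

```c
#include <float.h>
#include <stdbool.h>
#include <stddef.h>

/* Computes the expected result by accumulating the float inputs in double. */
double reference_sum(const float *values, size_t count)
{
    double total = 0.0;
    for (size_t i = 0; i < count; i++)
        total += (double)values[i];
    return total;
}

/* Returns true if actual is within max_ulps * |expected| * FLT_EPSILON
   of expected. This is a relative bound, so it becomes very strict when
   expected is close to zero. */
bool within_tolerance(float actual, double expected, double max_ulps)
{
    double error = (double)actual - expected;
    if (error < 0.0)
        error = -error;

    double magnitude = expected < 0.0 ? -expected : expected;
    return error <= max_ulps * magnitude * FLT_EPSILON;
}
```

A test would then call something like `within_tolerance(TestFunc(&b), reference_sum(inputs, 8), 4.0)`, where the factor 4.0 is an arbitrary example value. If the inputs can cancel each other so that the sum is near zero, a bound based on the sum of the absolute values of the inputs may be more appropriate.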
User response:
Disabling fast-math seems to fix this issue. Thanks to @njuffa for the suggestion. I was hoping to be able to design the test function around this optimization, but it doesn't seem to be possible. At least I know what the issue is now. Appreciate everyone's help on the problem!
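For anyone wondering why fast-math matters here: it allows the compiler to reassociate floating-point additions, and float addition is not associative. The sketch below illustrates this. The order used in `sum_reassociated` (2, 1, 3, 4, 5, 6, 0, 7) is one reading of the right-hand listing in the question and should be treated as an assumption; the function names and the suggested input are hypothetical. It has to be compiled without fast-math, otherwise the compiler is free to reorder both functions.

```c
/* Adds the eight values strictly left to right, as the C source is written. */
float sum_in_source_order(const float v[8])
{
    float a = 0.0f;
    for (int i = 0; i < 8; i++)
        a += v[i];
    return a;
}

/* Adds the same eight values in a different order.
   ASSUMPTION: the order 2, 1, 3, 4, 5, 6, 0, 7 is one reading of the
   "original loop" listing, not something confirmed by the compiler. */
float sum_reassociated(const float v[8])
{
    static const int order[8] = { 2, 1, 3, 4, 5, 6, 0, 7 };

    float a = v[order[0]];
    for (int i = 1; i < 8; i++)
        a += v[order[i]];
    return a;
}
```

With the input { 1e8f, 1.0f, 1.0f, 1.0f, 1.0f, 1.0f, 1.0f, 1.0f } on a target that evaluates float arithmetic in IEEE 754 single precision, the first function returns 100000000 and the second returns 100000008, because in the first case every 1.0f is rounded away when it is added to the large value. Hard-coding an order like this in a test is fragile, since the compiler may pick a different order after any change to the code or the flags.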