How does sorting algorithms sort containers and ranges of floats?-CodePudding

Since comparing floats is evil then if I have a container of floats and I sort it using some standard library sorting algorithm like std::sort then how does the algorithm sort them?

std::vector<float> vf{2.4f, 1.05f, 1.05f, 2.39f};
std::sort( vf.begin(), vf.end() );

So does the algorithm compare 1.05f and 1.05f?
Does it internally uses something like: std::fabs( 1.05f - 1.05f ) < 0.1;?
Does this apply too to containers of doubles? Thank you!

CodePudding user response：

So does the algorithm compare 1.05f and 1.05f?

Yes

Does it internally uses something like: std::fabs( 1.05f - 1.05f ) < 0.1;?

No, it uses operator <, e.g. 1.05f < 1.05f. It doesn't ever need to compare for equality, so the comparison using epsilon value is not needed.

Does this apply too to containers of doubles?

Yes, it applies to containers of any type (unless you provide your own comparison function to std::sort)

CodePudding user response：

Since comparing floats is evil…

That is a myth and is false. Related to it is the myth that floating-point numbers approximate real numbers.

<, <=, >, and >= work fine for sorting numbers, and == and != also correctly compare whether two numbers are equal. <, <=, >, and >= will not serve when sorting data containing NaNs (the “Not a Number” datum).

Per the IEEE 754 Standard for Floating-Point Arithmetic and other common specifications of floating-point formats, any floating-point representation ±F•b^e represents one number exactly. Even ∞ and −∞ are considered to be exact. It is the operations in floating-point arithmetic, not the numbers, that are specified to approximate real-number arithmetic. When a computational operation is performed, its result is the real-number results rounded to the nearest representable number according to a chosen rounding rule, except that operations with domain errors may produce a NaN. (Round-to-nearest-ties-to-even is the most common rule, and there are several others.)

Thus, when you add or subtract numbers, multiply or divide numbers, take square roots, or convert numbers from one base to another (such as decimal character input to internal floating-point format), there may be a rounding error.

Some operations have no error. Comparison operations have no error. <, <=, >, >=, ==, and != always produce the true mathematical result, true or false, with no error. And therefore they may be used for sorting numbers. (With NaNs, they always produce false, because a NaN is never less than, equal to, or greater than a number, nor is a NaN less than, equal to, or greater than a NaN. So false is the correct result, but it means these comparisons are not useful for sorting with NaNs.)

Understanding this distinction, that numbers are exact and operations may approximate, is essential for analyzing, designing, and proving algorithms involving floating-point arithmetic.

Of course, it may happen that in an array of numbers, the numbers contain errors from earlier operations: They differ from the numbers that would have been obtained by using real-number arithmetic. In this case, the numbers will be sorted into order based on their actual computed values, not on the values you would ideally like them to have. This is not a barrier to std::sort correctly sorting the numbers.

To sort data including NaNs, you need a total order predicate that reports whether one datum is earlier than another in the desired sort order, such as one that reports a NaN is later than any non-NaN. IEEE-754 defines a total order predicate, but I cannot speak to its availability in C . (std::less does not seem to provide it, based on a quick test I tried.)