Home > Back-end >  Generic function to accurately round floating-point to the nearest multiple of X
Generic function to accurately round floating-point to the nearest multiple of X

Time:03-17

I am trying to write a generic function which rounds a double input value to the nearest multiple of X.

Due to floating-point precision reasons, the naive approach of just scaling and rounding can fail:

double round(double in, double multiple)
{
    return std::round(in / multiple) * multiple;
}

For example, since 0.15 is actually stored as ~0.14999999999999999445, the above function when called to round to a multiple of 0.1 will return 0.1 rather than 0.2, as can be seen in the following godbolt.

So perhaps I should add an epsilon value, but what epsilon would be generic and sufficient?

I see there is a std::numeric_limits<double>::epsilon(), and I see from this SO answer, I need to scale the value to the magnitude of the number I'm working with.

So should I use the multiple input parameter as my scaling factor?

If so, my round function becomes:

double round(double in, double multiple)
{
    const double epsilon = std::numeric_limits<double>::epsilon() * multiple;
    return std::round((in   epsilon) / multiple) * multiple;
}

Finally to make my function work for negative numbers, I use std::signbit to work out if I should subtract epsilon instead of add:

double round(double in, double multiple)
{
    const double epsilon = std::numeric_limits<double>::epsilon() * multiple;
    const double incr = std::signbit(in) ? -epsilon : epsilon;
    return std::round((in   incr) / multiple) * multiple;
}

I've tested this with a range of inputs and multiples, and for the inputs I've tried it seems to work... but I'm uncertain.

Is my final version of round accurate?

If not, under what circumstances / for what inputs will it fail to produce the result I want it to, and is it possible to produce an accurate and generic function?

CodePudding user response:

Your round function is working correctly. The value of 0.14999999999999999445 is rounded downwards, just as you woudl expect it to be. The problem you face is that double values can not represent arbitrary values due to the limited precision. Now consider the following program with your round() function:

double x = 0.15;
double y = 0.14999999999999999445;

std::cout << round(x, 0.1) << " ";
          << round(y, 0.1) << std::endl;

It outputs something like 0.1 0.1. Because x and y have exactly the same value internally you will never get different results. Now if you add a small value before rounding (the epsilon you mentioned) you will get 0.2 0.2. This might be correct for the 0.15 value but not for the other one anymore.

What I am trying to show you here is, that no matter how carefully you design your round() function, there will always be inconsitencies due to the precision limit. The rounding cutoff will always fluctuate. Some time it is a little bit above the real cutoff, some other time it may be a little bit below. There is nothing you can do about that.

So I am afraid, there is no satifying solution to your problem. You must accept a small fuzziness if you want to work with binary floating point numbers.

Addition
As you asked for an example where your second implementation (the one with the epsilon) fails, take a look here. For 0.35 the rounded result is rounded towards 0.3.

CodePudding user response:

In addition to @Jakob Stark:

  • OP expected 0.15, rounded to the nearest 0.1, to round to 0.20 and not 0.10. But none of 0.15, 0.1, 0.20 and 0.10 are exactly encodable in binary64. Instead nearby values** (1st rounding) are used and the printed "0.10" is the best result given those encoded numbers.

  • OP's issue is highlighted by using a value near the half-way point of 0.10 and 0.20 and not printing output to 17 significant decimal places to see the issue.

  • OP's round(in / multiple) * multiple compounds the issue even more as both the division and multiplication each incur rounding (2nd and 4th - the 3rd being round()).

  • The displayed output is a 5th rounding.

To properly round to the nearest multiple either requires extended precision math (perhaps long double if it is wider) or simply tolerate getting an unexpected rounding with these near half-way cases.

I recommend against adding an epsilon. Any non-crafted code is certain to create more corner errors. The initial steps to round to a multiple require exact math. Casual use of floating point */ - does not provide that.

Instead consider a narrower problem: double round_decimal(double x, int pow10). round_decimal(0.15, -1) will still round to 0.1000000000000000055... as the call is really round_decimal(0.1499999999999999944..., -1), but at least reasonable code can be made to provide the best answers. "Generic function to accurately round floating-point to the nearest multiple of X" is far more challenging.


**

0.100000000000000005551115123125782702118158340454101562500000
0.149999999999999994448884876874217297881841659545898437500000
0.200000000000000011102230246251565404236316680908203125000000
  • Related