I want to make a set of floating point numbers, but with a twist:
When testing if some float x is a member of the set s, I want the test to return true if s contains some float f such that
abs(x - f) < tol
In other words, if the set contains a number that is close to x, return true. Otherwise return false.
One way I thought of doing this is to store numbers in a heap rather than a hash set, and use an approximate equality rule to decide whether the heap contains a close number.
However, that would take log(N) time, which is not bad, but it would be nice to get O(1) if such an algorithm exists.
Does anyone have any ideas how this might be possible?
CodePudding user response:
One idea I just had (probably not the only way to do it) is to mask the lower N bits of the number to all 0's. For instance, if you want the tolerance to be approx. 1E-3, force the lower 10 bits of mantissa to 0 when adding. Do the same when checking.
One caveat of this approach is that real computers often do weird things to LSB's of mantissas when you're not looking. You store x = b00111111100000000000000000000000, and when you retrieve it you get 00111111100000000000000000000001, 001111110111111111111111111111111, etc. The reasons for this are many, but the bottom line is that it's still brittle. Anything that relies on float equality is brittle.
Interested to hear other ideas, critiques, ways of overcoming the problem etc.
CodePudding user response:
If you're not too fussy about the tolerance, then you can round each number to the closest multiple of tol/4.
You can then use a hash map, but when you add a number x, add floor(4x/tol), floor(4x/tol 1) and floor(4x/tol-1).
When you look up a number x, look up floor(4x/tol), floor(4x/tol 1) and floor(4x/tol-1).
You will certainly find a match within tol/2, and you may find a match within tol.