Imagine to have some vectors (could be a torch
tensor or a numpy
array) with a huge number of components, each one very small (~ 1e-10).
Let's say that we want to calculate the norm of one of these vectors (or the dot product between two of them). Also using a float64
data type the precision on each component will be ~1e-10, while the product of 2 component (during the norm/dot product computation) can easily reach ~1e-20 causing a lot of rounding errors that, summed up together return a wrong result.
Is there a way to deal with this situation? (For example is there a way to define arbitrary precision array for these operations, or some built in operator that take care of that automatically?)
CodePudding user response:
You are dealing with two different issues here:
Underflow / Overflow
Calculating the norm of very small values may underflow to zero when you calculate the square. Large values may overflow to infinity. This can be solved by using a stable norm algorithm. A simple way to deal with this is to scale the values temporarily. See for example this:
a = np.array((1e-30, 2e-30), dtype='f4')
np.linalg.norm(a) # result is 0 due to underflow in single precision
scale = 1. / np.max(np.abs(a))
np.linalg.norm(a * scale) / scale # result is 2.236e-30
This is now a two-pass algorithm because you have to iterate over all your data before determining a scaling value. If this is not to your liking, there are single-pass algorithms, though you probably don't want to implement them in Python. The classic would be Blue's algorithm: http://degiorgi.math.hr/~singer/aaa_sem/Float_Norm/p15-blue.pdf
A simpler but much less efficient way is to simply chain calls to hypot (which uses a stable algorithm). You should never do this, but just for completion:
norm = 0.
for value in a:
norm = math.hypot(norm, value)
Or even a hierarchical version like this to reduce the number of numpy calls:
norm = a
while len(norm) > 1:
hlen = len(norm) >> 1
front, back = norm[:hlen], norm[hlen: 2 * hlen]
tail = norm[2 * hlen:] # only present with length is not even
norm = np.append(np.hypot(front, back), tail)
norm = norm[0]
You are free to combine these strategies. For example if you don't have your data available all at once but blockwise (e.g. because the data set is too large and you read it from disk), you can pick a scaling value per block, then chain the blocks together with a few calls to hypot.
Rounding errors
You accumulate rounding errors, especially when accumulating values of different magnitude. If you accumulate values of different signs, you may also experience catastrophic cancelation. To avoid these issues, you need to use a compensated summation scheme. Python provides a very good one with math.fsum
.
So if you absolutely need highest accuracy, go with something like this:
math.sqrt(math.fsum(np.square(a * scale))) / scale
Note that this is overkill for a simple norm since there are no sign changes in the accumulation (so no cancelation) and the squaring increases all differences in magnitude so that the result will always be dominated by its largest components, unless you are dealing with a truly horrifying dataset. That numpy does not provide built-in solutions for these issues tells you that the naive algorithm is actually good enough for most real-world applications. No reason to go overboard with the implementation before you actually run into trouble.
Application to dot products
I've focused on the l2 norm because that is the case that is more generally understood to be hazardous. Of course you can apply similar strategies to a dot product.
np.dot(a, b)
ascale = 1. / np.max(np.abs(a))
bscale = 1. / np.max(np.abs(b))
np.dot(a * ascale, b * bscale) / (ascale * bscale)
This is particularly useful if you use mixed precision. For example the dot product could be calculated in single precision but the x / (ascale * bscale)
could take place in double or even extended precision.
And of course math.fsum
is still available: dot = math.fsum(a * b)
Bonus thoughts
The whole scaling itself introduces some rounding errors because no one guarantees you that a/b is exactly representable in floating point. However, you can avoid this by picking a scaling factor that is an exact power of 2. Multiplying with a power of 2 is always exact in FP (assuming you stay in the representable range). You can get the exponent with math.frexp