einsum not giving overflow error when applied to int arrays

Time:05-23

I just hit a bug caused by np.sum and what I thought was an equivalent np.einsum call not giving the same result. Here is an example:

import numpy as np
import matplotlib.pyplot as plt

array = np.random.randint(-10000, 10000, size=(4, 100, 200, 600), dtype=np.int16)

# Sum over the first three axes in two ways
sum1 = np.sum(array, axis=(0,1,2))
sum2 = np.einsum('aijt->t', array)

print(np.allclose(sum1, sum2))

# Plot both results for comparison
plt.figure()
plt.plot(sum1)
plt.plot(sum2)
plt.show()

After some searching, I found that this is due to overflow of the int16 data type.

My question:

  • Why is np.einsum not giving the same result as np.sum here? The np.sum behaviour seems a lot more desirable to me, since it leads to fewer errors.
  • Why does np.einsum not raise an overflow error, or at least a warning? This is super scary in terms of hidden bugs. Should I be checking for overflow by hand every time I use the command?
  • Would this be considered a bug in numpy?

Answer:

Define a large int16:

In [322]: y=np.int16(32000)

Addition produces a warning:

In [323]: y + y
C:\Users\paul\AppData\Local\Temp\ipykernel_8828\1714217578.py:1: RuntimeWarning: overflow encountered in short_scalars
  y + y
Out[323]: -1536

np.sum promotes them to a larger int type, with no warning:

In [324]: np.sum((y,y))
Out[324]: 64000

In [325]: _.dtype
Out[325]: dtype('int32')
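
The promotion shown above can be compared directly with np.einsum. The following is a minimal sketch, not part of the original session: the array `a` and the names `total_sum` and `total_einsum` are made up for illustration. As documented for np.sum, integer inputs narrower than the platform default integer are accumulated in that default integer (presumably why the session above reports int32), while np.einsum keeps the input dtype unless told otherwise.

```python
import numpy as np

# Four int16 values whose true sum (128000) does not fit in int16.
a = np.full(4, 32000, dtype=np.int16)

total_sum = np.sum(a)                # accumulates in the platform default integer
total_einsum = np.einsum('i->', a)   # result stays int16, so it wraps around silently

print(total_sum.dtype, int(total_sum))
print(total_einsum.dtype, int(total_einsum))
```

The two printed dtypes differ, and only the np.sum value is the arithmetically correct total.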

Make an array from that:

In [326]: Y = np.array(y)

Overflow without warning:

In [327]: Y + Y
Out[327]: -1536

I don't recall the details, but it's been explained that checking each element of an array for overflow is/was considered to be too expensive.
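
If a check is wanted anyway, it does not have to look at every intermediate result. A rough sketch of a worst-case bound, assuming a hypothetical helper named `reduction_fits` (not a NumPy function): it multiplies the array's minimum and maximum by the number of terms per output element and compares against np.iinfo. It is conservative, so it may report a possible overflow that would not actually occur.

```python
import numpy as np

def reduction_fits(arr, n_terms, dtype=None):
    """Hypothetical helper: True if summing n_terms elements of arr
    cannot overflow dtype (defaults to arr's own integer dtype)."""
    info = np.iinfo(np.dtype(arr.dtype if dtype is None else dtype))
    # Use Python ints so the bound itself cannot overflow.
    lowest = int(arr.min()) * n_terms
    highest = int(arr.max()) * n_terms
    return info.min <= lowest and highest <= info.max

arr = np.full((4, 100), 10000, dtype=np.int16)
print(reduction_fits(arr, n_terms=400))                  # int16 accumulator
print(reduction_fits(arr, n_terms=400, dtype=np.int64))  # int64 accumulator
```

This costs two passes over the data (min and max), independent of how the reduction itself is carried out.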

Rather than checking 'by hand', just be aware of the overflow possibility, and don't use smaller dtypes unnecessarily.
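
One way to follow that advice with einsum is to widen the computation explicitly. A minimal sketch, assuming a smaller array than in the question so it runs quickly (the shape, the seed, and the variable names are illustrative): np.einsum accepts a `dtype` argument that forces the calculation dtype, and casting the input with `astype` beforehand is an alternative.

```python
import numpy as np

rng = np.random.default_rng(0)
array = rng.integers(-10000, 10000, size=(4, 10, 20, 60), dtype=np.int16)

# Reference: np.sum accumulates small ints in the platform default integer.
sum_ref = np.sum(array, axis=(0, 1, 2))

# Option 1: ask einsum to compute in int64 (int16 -> int64 is a safe cast).
sum_wide = np.einsum('aijt->t', array, dtype=np.int64)

# Option 2: widen the input first, then call einsum as before.
sum_cast = np.einsum('aijt->t', array.astype(np.int64))

print(np.array_equal(sum_ref, sum_wide), np.array_equal(sum_ref, sum_cast))
```

Both options give the same numbers as np.sum here; which one is cheaper likely depends on the array size, so it is worth timing on real data.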

A possible duplicate: Sum of positive numbers results in a negative number