Python: numpy.sum returns wrong output (numpy version 1.21.3)

Time:10-22

Here I have a 1D array:

>>> import numpy as np
>>> a = np.array([75491328, 75491328, 75491328, 75491328, 75491328, 75491328, 75491328, 75491328,
                  75491328, 75491328, 75491328, 75491328, 75491328, 75491328, 75491328, 75491328,
                  75491328, 75491328, 75491328, 75491328, 75491328, 75491328, 75491328, 75491328,
                  75491328, 75491328, 75491328, 75491328, 75491328, 75491328, 75491328, 75491328,
                  75491328, 75491328, 75491328, 75491328, 75491328, 75491328, 75491328, 75491328,
                  75491328, 75491328, 75491328, 75491328, 75491328, 75491328, 75491328, 75491328,
                  75491328, 75491328, 75491328, 75491328, 75491328, 75491328, 75491328, 75491328,
                  75491328, 75491328, 75491328, 75491328, 75491328, 75491328, 75491328, 75491328])

The sum of all 64 elements should be 75491328*8*8 = 4831444992. However, when I use np.sum, I get a different output:

>>> np.sum(a)
536477696

That's what happens in my Jupyter Notebook with the latest version of NumPy (1.21.3). But in Coursera's Jupyter Notebook, which uses the older NumPy 1.18.4, everything is fine.

How can I fix this bug? Is it a bug or is it because of me? Please explain and help me fix this. Thanks in advance.

CodePudding user response:

On my setup, this returns the answer you expect:

$ ./venv/bin/python --version
Python 3.9.7

requirements.txt

numpy==1.21.3

main.py

#!./venv/bin/python

import numpy as np
a = np.array([75491328, 75491328, 75491328, 75491328, 75491328, 75491328, 75491328, 75491328,
                  75491328, 75491328, 75491328, 75491328, 75491328, 75491328, 75491328, 75491328,
                  75491328, 75491328, 75491328, 75491328, 75491328, 75491328, 75491328, 75491328,
                  75491328, 75491328, 75491328, 75491328, 75491328, 75491328, 75491328, 75491328,
                  75491328, 75491328, 75491328, 75491328, 75491328, 75491328, 75491328, 75491328,
                  75491328, 75491328, 75491328, 75491328, 75491328, 75491328, 75491328, 75491328,
                  75491328, 75491328, 75491328, 75491328, 75491328, 75491328, 75491328, 75491328,
                  75491328, 75491328, 75491328, 75491328, 75491328, 75491328, 75491328, 75491328])


print("output should be \n 4831444992")
print("calculated output \n",np.sum(a))

output

output should be 
 4831444992
calculated output 
 4831444992

Any chance your actual code uses a different precision or overrides the datatype?
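A quick way to check is to look at the dtype NumPy picked for the array. The snippet below is only a sketch: the names `expected` and `fits` are made up for illustration, and the dtype printed depends on your platform and NumPy version.

```python
import numpy as np

# Same data as in the question: 64 copies of 75491328.
a = np.array([75491328] * 64)

# The default integer dtype is platform dependent, so print it
# instead of assuming it.
print("dtype:", a.dtype)
print("largest value this dtype can hold:", np.iinfo(a.dtype).max)

# Python ints have arbitrary precision, so this is the true sum.
expected = 75491328 * 64

# If the true sum does not fit in the array's dtype, a sum computed
# in that dtype cannot be correct.
fits = expected <= np.iinfo(a.dtype).max
print("true sum fits in dtype:", fits)
```

If this prints a 32-bit dtype and `False`, the dtype is the likely cause rather than the NumPy version itself.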

CodePudding user response:

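The problem is caused by integer overflow. Your array was most likely created with a 32-bit integer dtype (for example, the default on Windows with this NumPy version), so the sum wraps around modulo 2**32: 4831444992 - 2**32 = 536477696, which is exactly the value you got. Change the dtype of the array to int64: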

import numpy as np
# Force 64-bit integers so the sum cannot wrap around at 2**32
a = np.array([75491328] * 64, dtype=np.int64)
print(np.sum(a))  # 4831444992
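To see the overflow regardless of platform, here is a minimal sketch that forces 32-bit arithmetic explicitly. Passing `dtype=np.int32` to `np.sum` is my way of reproducing the problem, not necessarily how it arose in the original notebook; the variable names are illustrative.

```python
import numpy as np

values = [75491328] * 64

# Store the data as 32-bit integers, as in the suspected failure case.
a32 = np.array(values, dtype=np.int32)

# Accumulate in int32 as well: the running total silently wraps around.
wrapped = int(np.sum(a32, dtype=np.int32))

# Fix 1: keep the int32 array but accumulate in int64.
exact = int(np.sum(a32, dtype=np.int64))

# Fix 2: convert the array itself to int64 before summing.
exact_converted = int(np.sum(a32.astype(np.int64)))

print(wrapped)          # 536477696
print(exact)            # 4831444992
print(exact - wrapped)  # 4294967296, i.e. 2**32
```

Either fix works. Converting the array is the safer choice if you do more arithmetic on it afterwards, since other operations on an int32 array can overflow in the same way.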