Why does the dtype of a numpy array automatically change to 'object' if you multiply the a

Time:11-28

Given an arbitrary numpy array (its size and shape don't seem to play a role)

import numpy as np

a = np.array([1.])
print(a.dtype)  # float64

its dtype changes if you multiply it by a number equal to or larger than 10**20

print((a*10**19).dtype)  # float64
print((a*10**20).dtype)  # object

a *= 10**20  # Throws TypeError: ufunc 'multiply' output (typecode 'O') 
#             could not be coerced to provided output parameter (typecode 'd') 
#             according to the casting rule ''same_kind''

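a *= 10.**20  # Throws numpy.core._exceptions._UFuncOutputCastingError:
#             Cannot cast ufunc 'multiply' output from dtype('float64') to
#             dtype('int32') with casting rule 'same_kind'
#             (the int32 target means this message came from an integer array,
#             not from the float64 array a defined above)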

However, this doesn't happen if you multiply element-wise

a[0] *= 10**20  
print(a, a.dtype)  # [1.e+20] float64

or explicitly convert the number to a float (or int)

a *= float(10**20)  
print(a, a.dtype)  # [1.e+20] float64

Just for the record, if you do the multiplication outside of numpy, there are no issues

b = 1.
print(type(b), type(10**20), type(10.**20))  # <class 'float'> <class 'int'> <class 'float'>

b *= 10**20
print(type(b))  # <class 'float'>

CodePudding user response:

I expect the limit is the largest value a native (fixed-width) integer can hold on the system.

import sys

print(sys.maxsize, sys.getsizeof(sys.maxsize))
# => 9223372036854775807 36
print(10**19, sys.getsizeof(10**19))
# => 10000000000000000000 36

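On my system the conversion to object starts just past this point (10**19 is above sys.maxsize but still fits in an unsigned 64-bit integer, while 10**20 does not), which shows up when I do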

for i in range(1, 24):
    print(f'type of a*10**{i}:', (a * 10**i).dtype)
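
The switch can also be seen without involving `a` at all, by asking NumPy which dtype it infers for the bare Python integer. This is only a sketch (`inferred_dtype` is a name I made up for illustration). On the 64-bit installs I have seen it reports int64 for 10**18, uint64 for 10**19 and object for 10**20, but other platforms or versions may differ.

```python
import numpy as np

def inferred_dtype(value):
    """Return the dtype NumPy picks when it wraps a bare Python int (illustrative helper)."""
    return np.array(value).dtype

# Walk across the signed and unsigned 64-bit limits.
for exp in (18, 19, 20):
    print(f"10**{exp} -> {inferred_dtype(10**exp)}")
```

Once the integer no longer fits any fixed-width type, NumPy falls back to storing the Python object itself, and that is the dtype that then leaks into the product.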

I do expect it is linked to the implementation of the integer:

PEP 0237: Essentially, long renamed to int. That is, there is only one built-in integral type, named int; but it behaves mostly like the old long type.

See https://docs.python.org/3.1/whatsnew/3.0.html#integers
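
To make the link to the integer implementation concrete: a Python int grows as needed, whereas a 64-bit integer has a fixed number of value bits. The sketch below uses only the standard library and counts the bits each power of ten needs. The helper `fits` and the two constants are my own names, not anything from NumPy.

```python
INT64_BITS = 63    # value bits of a signed 64-bit integer
UINT64_BITS = 64   # value bits of an unsigned 64-bit integer

def fits(n):
    """Say which 64-bit integer types can hold the non-negative int n (illustrative helper)."""
    bits = n.bit_length()
    return {"bits": bits, "int64": bits <= INT64_BITS, "uint64": bits <= UINT64_BITS}

for exp in (18, 19, 20):
    print(f"10**{exp}:", fits(10**exp))
```

10**19 needs 64 bits, so it is too large for a signed 64-bit integer but still fits an unsigned one. 10**20 needs 67 bits and fits neither.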

To notice this, one could use numpy.multiply with a forced output type. This will throw an error and not silently convert (similar to your *= example).
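
A minimal sketch of that idea, assuming a float64 output buffer passed through the `out` argument of numpy.multiply. On the NumPy version from the question I would expect a TypeError like the one shown for `a *= 10**20`. As far as I know, newer releases changed how Python scalars are promoted, so the call may simply succeed there, which is why both outcomes are handled.

```python
import numpy as np

a = np.array([1.0])
out = np.empty_like(a)  # float64 buffer that the result is forced into

try:
    np.multiply(a, 10**20, out=out)
    outcome = f"succeeded: out={out}, dtype={out.dtype}"
except TypeError as exc:
    # Raised when 10**20 is promoted to object and cannot be cast back to float64.
    outcome = f"raised {type(exc).__name__}"

print(outcome)
```

Either way the buffer keeps its float64 dtype, so an unexpected promotion to object shows up as an error rather than silently.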
