I have a column of data and I am trying to remove the decimal places so the values read 56008265990, 56013205307, etc.
It currently looks like this when I read it into Python:
0 56008265990.000
1 56013205307.000
2 56000116799.000
3 56000959848.000
4 56010419025.000
...
49056 56000818137.000
49057 56000146564.000
49058 56001739190.000
49059 56002050665.000
49060 56003026564.000
Name: ID, Length: 49061, dtype: float64
There are some NaN values, so I am running:
df['ID'] = df['ID'].fillna(0)
df['ID'] = df['ID'].astype(int)
But when I print the result, I get this output:
0 -2147483648
1 -2147483648
2 -2147483648
3 -2147483648
4 -2147483648
...
49056 -2147483648
49057 -2147483648
49058 -2147483648
49059 -2147483648
49060 -2147483648
Name: ID, Length: 49061, dtype: int32
Any help is much appreciated!
CodePudding user response:
Use int64 for correct casting:
import numpy as np
df['ID'] = df['ID'].fillna(0).astype(np.int64)
print(df)
ID
0 56008265990
1 56013205307
2 56000116799
3 56000959848
4 56010419025
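If you would rather keep the missing values as missing instead of filling them with 0, a minimal sketch using pandas' nullable Int64 dtype (capital "I"; supported in recent pandas versions) would be:
import pandas as pd

# hypothetical sample standing in for your real df['ID'] column
s = pd.Series([56008265990.0, 56013205307.0, None], name='ID')

# 'Int64' (capital I) is the nullable 64-bit integer dtype,
# so missing values stay as <NA> instead of becoming 0
print(s.astype('Int64'))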
CodePudding user response:
Use:
import numpy as np
df['ID'] = df['ID'].fillna(0).astype(np.int64)
The reason is that the int type you ended up with (int32) is limited to the range -2147483648 through 2147483647, and your numbers fall outside those limits:
56008265990 > 2147483647
np.int64, on the other hand, ranges from -2**63 to 2**63 - 1, which is more than enough for your values.
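If you want to verify those bounds yourself, a quick check with numpy's iinfo (nothing assumed beyond numpy itself):
import numpy as np

print(np.iinfo(np.int32))   # min = -2147483648, max = 2147483647
print(np.iinfo(np.int64))   # min = -9223372036854775808, max = 9223372036854775807

# the ID values are larger than the int32 maximum, hence the overflow
print(56008265990 > np.iinfo(np.int32).max)   # True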