I have a column of data and I am trying to remove the decimal places so the values read 56008265990, 56013205307, etc.
It currently looks like this when I read it into Python:
0 56008265990.000
1 56013205307.000
2 56000116799.000
3 56000959848.000
4 56010419025.000
...
49056 56000818137.000
49057 56000146564.000
49058 56001739190.000
49059 56002050665.000
49060 56003026564.000
Name: ID, Length: 49061, dtype: float64
There are some NaN values, so I am running:
df['ID'] = df['ID'].fillna(0)
df['ID'] = df['ID'].astype(int)
But when I print the result, I get this output:
0 -2147483648
1 -2147483648
2 -2147483648
3 -2147483648
4 -2147483648
...
49056 -2147483648
49057 -2147483648
49058 -2147483648
49059 -2147483648
49060 -2147483648
Name: ID, Length: 49061, dtype: int32
Any help is much appreciated!
CodePudding user response:
Use int64 for correct casting:
import numpy as np
df['ID'] = df['ID'].fillna(0).astype(np.int64)
print(df)
ID
0 56008265990
1 56013205307
2 56000116799
3 56000959848
4 56010419025
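If you would rather keep the missing values as missing instead of filling them with 0, a minimal sketch using pandas' nullable Int64 dtype (capital "I"; supported in recent pandas versions) would be:
import pandas as pd

# hypothetical sample standing in for your real df['ID'] column
s = pd.Series([56008265990.0, 56013205307.0, None], name='ID')

# 'Int64' (capital I) is the nullable 64-bit integer dtype,
# so missing values stay as <NA> instead of becoming 0
print(s.astype('Int64'))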
CodePudding user response:
Use:
import numpy as np
df['ID'] = df['ID'].fillna(0).astype(np.int64)
The reason is that the int type you ended up with (int32) is limited to the range -2147483648 through 2147483647, and your numbers fall outside those limits:
56008265990 > 2147483647
np.int64, on the other hand, ranges from -2**63 to 2**63 - 1, which is more than enough for your values.
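If you want to verify those bounds yourself, a quick check with numpy's iinfo (nothing assumed beyond numpy itself):
import numpy as np

print(np.iinfo(np.int32))   # min = -2147483648, max = 2147483647
print(np.iinfo(np.int64))   # min = -9223372036854775808, max = 9223372036854775807

# the ID values are larger than the int32 maximum, hence the overflow
print(56008265990 > np.iinfo(np.int32).max)   # True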