pandas astype doesn't work as expected (fails silently and badly)

Time:11-29

I've run into some strange behavior with pandas .astype() (I'm using version 1.5.2). After casting a column to integer, dtypes reports exactly what I expect. But when I extract the values by row, the types come back inconsistent.

Code:

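import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.randn(3, 3))
df.loc[:, 0] = df.loc[:, 0].astype(int)  # cast the first column to integer

print(df)
print(df.dtypes)              # column 0 is reported as int64
print(df.iloc[0, :])          # but the first row comes back as float64
print(type(df.values[0, 0]))  # and so does a scalar taken from .values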

Out:

   0         1         2
0  0 -0.232432  1.025643
1 -1  0.556968 -0.729378
2 -1  1.285546 -0.541676
0      int64
1    float64
2    float64
dtype: object
0    0.000000
1   -0.232432
2    1.025643
Name: 0, dtype: float64
<class 'numpy.float64'>

Any guess of what I'm doing wrong here?

I also tried the assignment without loc:

df[0] = df[0].astype(int)

That didn't work either.

Answer:

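I think this is due to df.values, which returns a NumPy representation of the DataFrame. A NumPy array has a single dtype, so the int64 column is upcast to float64 to match the other columns. The same happens with df.iloc[0, :], because a row is returned as a Series, and a Series also has a single dtype. The cast itself worked. As per the docs: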

By default, the dtype of the returned array will be the common NumPy dtype of all types in the DataFrame.

>>> from pandas.core.dtypes.cast import find_common_type
>>> find_common_type(df.dtypes.to_list()) # df is your dataframe
dtype('float64')
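If you need the integer back as an integer, one option is to read it through the column rather than through the row. Here is a minimal sketch, assuming the same 3x3 frame as in the question. The variable names are only illustrative, and np.result_type is used as a public stand-in for the internal find_common_type helper above. Column and scalar access keep the column's own dtype, while row and whole-frame access get upcast.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randn(3, 3))
# Assign through df[0] so the column is replaced by the int64 result.
df[0] = df[0].astype("int64")

# The single dtype able to hold one int64 and two float64 columns at once.
common = np.result_type(*df.dtypes)
print(common)

# Row-wise and whole-frame access are upcast to that common dtype.
row_dtype = df.iloc[0, :].dtype
values_dtype = df.values.dtype
print(row_dtype, values_dtype)

# Column-wise and scalar access keep the integer dtype.
from_column = df[0].iloc[0]
from_iat = df.iat[0, 0]
print(type(from_column), type(from_iat))

# itertuples builds each row from the columns, so the first field stays an integer.
first_row = next(df.itertuples(index=False, name=None))
print(type(first_row[0]), type(first_row[1]))
```

So if you have to iterate row by row, itertuples is probably the safer choice over iloc or iterrows when the columns have mixed dtypes.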