Pandas: Convert multiple columns in DataFrame from object to float

Time:01-17

To clean up my data, I want to convert several columns that should be numeric to a numeric dtype. Assigning multiple columns at once has no effect.

Minimal (non-)working example. Column 'S' holds strings; 'D' and 'L' should be converted to numeric. All dtypes are object, as expected:

1)

import pandas as pd

x = pd.DataFrame([['S', 1, 5],
                  ['M', 1.4, '10'],
                  ['L', '2.3', 14.5]],
                 columns=['S', 'D', 'L'])
x.dtypes
S    object
D    object
L    object
dtype: object

2) I tried the conversion using pd.to_numeric and astype(float) on the slice. Neither changes the dtypes:

x.loc[:, ['D', 'L']] = x.loc[:, ['D', 'L']].apply(pd.to_numeric)
x.dtypes
S    object
D    object
L    object
dtype: object

3) Check: creating a new DataFrame from the right-hand side does work and gives the correct dtype (float64):

x.loc[:, ['D', 'L']].apply(pd.to_numeric).dtypes
D    float64
L    float64
dtype: object

4) What does work is assigning a single column:

x.loc[:, 'D'] = x.loc[:, 'D'].apply(pd.to_numeric)

This yields the correct dtype for column 'D'.

The weird thing is that after assigning a single column with a corrected dtype, the assignment of multiple columns (as in 2) works?!

5)

x.loc[:, ['D', 'L']] = x.loc[:, ['D', 'L']].apply(pd.to_numeric)
x.dtypes
S     object
D    float64
L    float64
dtype: object

I am not sure whether this has to do with views vs. copies, but even when creating a deep copy (copy(deep=True)), assigning to a slice with multiple columns has no effect.

Why does this not work, and how does pandas expect assignment to multiple columns to be done?

CodePudding user response:

This one works for me:

x[['D', 'L']] = x[['D', 'L']].astype(float)
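
A related variant, sketched here as an alternative rather than as the method this answer uses: astype also accepts a mapping from column name to dtype and returns a new frame, so no slice assignment is involved at all. The name converted is only for illustration.

```python
import pandas as pd

# Same frame as in the question: 'D' and 'L' mix numbers and numeric strings.
x = pd.DataFrame([['S', 1, 5],
                  ['M', 1.4, '10'],
                  ['L', '2.3', 14.5]],
                 columns=['S', 'D', 'L'])

# astype with a per-column mapping returns a new DataFrame;
# columns not named in the mapping (here 'S') are left untouched.
converted = x.astype({'D': float, 'L': float})
print(converted.dtypes)
```

Note that astype(float) raises an error if a value cannot be parsed as a number, which is usually what you want when cleaning data.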

CodePudding user response:

Get rid of the loc and the row slice: x[['D', 'L']] = x[['D', 'L']].apply(pd.to_numeric)
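
As for why this helps, a likely explanation (hedged, since the exact behavior has changed between pandas versions): assigning through .loc[:, cols] tries to write the new values into the existing columns, which keeps their object dtype, whereas plain [] assignment replaces the columns and so takes the dtype of the right-hand side. A minimal runnable sketch of the [] form, using the frame from the question:

```python
import pandas as pd

x = pd.DataFrame([['S', 1, 5],
                  ['M', 1.4, '10'],
                  ['L', '2.3', 14.5]],
                 columns=['S', 'D', 'L'])

cols = ['D', 'L']

# Plain [] assignment replaces both columns with the converted ones,
# so they pick up the float64 dtype produced by pd.to_numeric.
x[cols] = x[cols].apply(pd.to_numeric)
print(x.dtypes)
```

If you run the .loc version from the question on the same frame, compare x.dtypes afterwards on your own pandas version before relying on either result.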
