To clean up my data, I want to convert several columns that should be numeric to a numeric dtype. Assigning to multiple columns at once has no effect.
Minimal (non-)working example.
The column 'S' is a string; 'D' and 'L' should be converted to numeric. The dtypes are object, as expected.
1)
import pandas as pd

x = pd.DataFrame([['S', 1, 5],
                  ['M', 1.4, '10'],
                  ['L', '2.3', 14.5]],
                 columns=['S', 'D', 'L'])
x.dtypes
S object
D object
L object
dtype: object
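A small sketch of one way to see why 'D' and 'L' come out as object: each column mixes plain numbers with strings. The name str_counts is only for illustration.

```python
import pandas as pd

x = pd.DataFrame([['S', 1, 5],
                  ['M', 1.4, '10'],
                  ['L', '2.3', 14.5]],
                 columns=['S', 'D', 'L'])

# Count how many values in each column are Python strings.
# str_counts is an illustrative name, not part of pandas.
str_counts = {col: sum(isinstance(v, str) for v in x[col])
              for col in ['D', 'L']}
print(str_counts)
```

Any column holding at least one string next to numbers is stored as object until it is converted explicitly.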
2)
I tried the conversion using pd.to_numeric and astype(float) on the slice. It does not work:
x.loc[:, ['D','L'] ] = x.loc[:, ['D', 'L']].apply(pd.to_numeric)
x.dtypes
S object
D object
L object
dtype: object
3)
Check: creating a new DataFrame from the right-hand side does work and gives the correct type (float64):
x.loc[:, ['D', 'L']].apply(pd.to_numeric).dtypes
D float64
L float64
dtype: object
4)
What does work is assigning a single column:
x.loc[:, 'D' ] = x.loc[:, 'D'].apply(pd.to_numeric)
yields the correct types for column 'D'.
5)
The weird thing is that after assigning a single column with the corrected type, the assignment of multiple columns (as in 2) works:
x.loc[:, ['D','L'] ] = x.loc[:, ['D', 'L']].apply(pd.to_numeric)
x.dtypes
S object
D float64
L float64
dtype: object
I am not sure if it has to do with views vs. copies, but even when creating a deep copy (copy(deep=True)), the assignment to a slice with multiple columns has no effect.
Why does this not work, and how does pandas expect me to work with multiple columns?
CodePudding user response:
This one works for me:
x[['D', 'L']] = x[['D', 'L']].astype(float)
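A related sketch, in case it is useful: astype also accepts a dict mapping column names to dtypes and returns a new DataFrame, so no slice assignment is needed at all. The name converted is only for illustration.

```python
import pandas as pd

x = pd.DataFrame([['S', 1, 5],
                  ['M', 1.4, '10'],
                  ['L', '2.3', 14.5]],
                 columns=['S', 'D', 'L'])

# astype with a {column: dtype} mapping returns a new DataFrame;
# x itself is left unchanged. "converted" is an illustrative name.
converted = x.astype({'D': float, 'L': float})
print(converted.dtypes)
```

Rebinding with x = x.astype({...}) would then replace the original frame.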
CodePudding user response:
Get rid of the loc and the row slice:
x[['D', 'L']] = x[['D', 'L']].apply(pd.to_numeric)
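A minimal end-to-end sketch of this approach on the DataFrame from the question. My understanding, which may depend on the pandas version, is that x[cols] = ... replaces the selected columns with the new ones, whereas x.loc[:, cols] = ... tries to write the new values into the existing object columns and so can keep their dtype. The name cols is only for illustration.

```python
import pandas as pd

x = pd.DataFrame([['S', 1, 5],
                  ['M', 1.4, '10'],
                  ['L', '2.3', 14.5]],
                 columns=['S', 'D', 'L'])

cols = ['D', 'L']  # illustrative name, not from the question

# Convert on the right-hand side, then assign without .loc so the
# converted columns replace the old object columns.
x[cols] = x[cols].apply(pd.to_numeric)

print(x.dtypes)
```

Checking x.dtypes afterwards is a quick way to confirm the conversion took effect on the pandas version in use.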