I have data like this and I'm trying to use pd.to_numeric on it:
import pandas as pd
import numpy as np
series = pd.Series([0,1,2,2,3,4,7,8.2,"stackoverflow",7,9.9])
df = pd.to_numeric(series, errors="coerce")
print(df)
The output I got is as expected:
0     0.0
1     1.0
2     2.0
3     2.0
4     3.0
5     4.0
6     7.0
7     8.2
8     NaN
9     7.0
10    9.9
dtype: float64
But when the Series contains np.array(4) in place of the plain 4, I get a different output:
import pandas as pd
import numpy as np
series = pd.Series([0,1,2,2,3,np.array(4),7,8.2,"stackoverflow",7,9.9])
df = pd.to_numeric(series, errors="coerce")
print(df)
The output in case 2 is not as expected. It doesn't even convert the string
to NaN as it did in the example above, and the dtype is not converted
to float either:
0                 0
1                 1
2                 2
3                 2
4                 3
5                 4
6                 7
7               8.2
8     stackoverflow
9                 7
10              9.9
dtype: object
CodePudding user response:
Please note that both of your df variables are of type
<class 'pandas.core.series.Series'>, not DataFrame.
After some experimenting, I came up with this fix for your case:
import pandas as pd
import numpy as np

def convert_list_to_series(data_list):
    series = pd.Series(data_list)
    # Call pd.to_numeric on each element separately instead of on the whole Series
    return series.apply(pd.to_numeric, errors="coerce")

data_list = [0, 1, 2, 2, 3, 4, 7, 8.2, "stackoverflow", 7, 9.9]
print(convert_list_to_series(data_list))
Output:
0     0.0
1     1.0
2     2.0
3     2.0
4     3.0
5     4.0
6     7.0
7     8.2
8     NaN
9     7.0
10    9.9
dtype: float64
data_list = [0, 1, 2, 2, 3, np.array(4), 7, 8.2, "stackoverflow", 7, 9.9]
print(convert_list_to_series(data_list))
Output:
0     0.0
1     1.0
2     2.0
3     2.0
4     3.0
5     4.0
6     7.0
7     8.2
8     NaN
9     7.0
10    9.9
dtype: float64
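As a variant of the same idea, here is a rough sketch that unwraps any 0-d NumPy arrays first and then calls pd.to_numeric once on the whole Series, rather than once per element. The helper name unwrap_zero_dim is made up for this example, and it assumes the only non-scalar values in the data are 0-d arrays like np.array(4):

```python
import numpy as np
import pandas as pd


def unwrap_zero_dim(value):
    """Return the Python scalar held by a 0-d NumPy array; pass anything else through.

    Hypothetical helper for illustration, not part of pandas or NumPy.
    """
    if isinstance(value, np.ndarray) and value.ndim == 0:
        return value.item()
    return value


series = pd.Series([0, 1, 2, 2, 3, np.array(4), 7, 8.2, "stackoverflow", 7, 9.9])

# Replace 0-d arrays with plain scalars so every element is a scalar again.
cleaned = series.map(unwrap_zero_dim)

# A single call on the whole Series; the string should be coerced to NaN.
result = pd.to_numeric(cleaned, errors="coerce")
print(result)
```

If the data can also hold longer arrays or lists, this helper leaves them untouched, so they would need separate handling.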
CodePudding user response:
This is strange behaviour. I guess it is because not all of the values
are scalars.
Just build a NumPy array from the values of that column and try using pd.to_numeric on it:
import pandas as pd
import numpy as np

series = pd.Series([0, 1, 2, 2, 3, np.array(4), 7, 8.2, "stackoverflow", 7, 9.9])
df = pd.DataFrame(series)
# Rebuild the column as a plain NumPy array before converting it
serieslist = np.array(df[0].tolist())
df['new'] = pd.to_numeric(serieslist, errors='coerce')
print(df)
Output:
                0  new
0               0  0.0
1               1  1.0
2               2  2.0
3               2  2.0
4               3  3.0
5               4  4.0
6               7  7.0
7             8.2  8.2
8   stackoverflow  NaN
9               7  7.0
10            9.9  9.9
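To check the guess about non-scalar values, something like the following could be used to find which positions hold a NumPy array instead of a plain scalar. This is only a sketch; the variable names is_array and positions are invented for the example:

```python
import numpy as np
import pandas as pd

series = pd.Series([0, 1, 2, 2, 3, np.array(4), 7, 8.2, "stackoverflow", 7, 9.9])

# True for every element that is a NumPy array rather than a plain scalar.
is_array = series.map(lambda v: isinstance(v, np.ndarray))

# Index labels of the elements flagged above.
positions = is_array[is_array].index.tolist()
print(positions)

# The element types give a quick overview of what the object Series holds.
print(series.map(lambda v: type(v).__name__).value_counts())
```

Any position listed there is a candidate for unwrapping or replacing before calling pd.to_numeric.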