My dataframe column output NaN for all values

Time:07-09

41-45          93
46-50          81
36-40          73
51-55          71
26-30          67
21-25          62
31-35          61
56-70          29
56-60          26
61 or older    23
15-20          10
Name: age, dtype: int64


pd.to_numeric(combined['age'], errors='coerce')

I used the code above to convert my dataframe column to numeric, but it converts every value to NaN.

Here is my output

3     NaN
5     NaN
8     NaN
9     NaN
11    NaN
       ..
696   NaN
697   NaN
698   NaN
699   NaN
701   NaN
Name: age, Length: 651, dtype: float64

CodePudding user response:

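The values in the column are ranges such as "41-45", not numbers, so pd.to_numeric cannot parse them and errors='coerce' turns each one into NaN. Split each range into its two ends first: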

import pandas as pd

df = pd.DataFrame({"age": ["41-45", "46-50","61 or older"], "Col2": [93, 81, 23]})

cols = ["Lower_End_Age", "Higher_End_Age"]  # names of the two new columns

# Replace spaces with the "-" delimiter, then split once (n=1) on that delimiter
# so that "61 or older" becomes "61" and "or-older".
df[cols] = df["age"].str.replace(' ', '-').str.split("-", n=1, expand=True)

print(df)
           age  Col2 Lower_End_Age Higher_End_Age
0        41-45    93            41             45
1        46-50    81            46             50
2  61 or older    23            61       or-older

Then convert the lower end to numeric:

df['Lower_End_Age'] = pd.to_numeric(df['Lower_End_Age'], errors='coerce')

df.dtypes

age               object
Col2               int64
Lower_End_Age      int64
Higher_End_Age    object

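If you want to get rid of "or-older", repeat the conversion on the higher end. errors='coerce' turns the non-numeric value into NaN, which is why the column becomes float: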

df['Higher_End_Age'] = pd.to_numeric(df['Higher_End_Age'], errors='coerce')

print(df)
           age  Col2  Lower_End_Age  Higher_End_Age
0        41-45    93             41            45.0
1        46-50    81             46            50.0
2  61 or older    23             61             NaN
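
As an alternative to replacing and splitting, the two ends can be pulled out with a regular expression. The following is a minimal sketch, assuming every value starts with digits and that an upper end, when present, follows a hyphen. The column names `lower` and `upper` and the sample rows are illustrative, not part of the original data frame. It also reproduces the all-NaN result from the question for comparison.

```python
import pandas as pd

# Sample values copied from the question's `age` column.
df = pd.DataFrame({"age": ["41-45", "46-50", "61 or older", "15-20"]})

# Strings such as "41-45" are not valid numbers, so every entry is coerced to NaN.
direct = pd.to_numeric(df["age"], errors="coerce")

# Capture the leading digits as the lower end and, if present,
# the digits after a hyphen as the upper end.
bounds = df["age"].str.extract(r"^(?P<lower>\d+)(?:-(?P<upper>\d+))?")

# The nullable Int64 dtype keeps whole numbers while allowing a missing upper end.
df["lower"] = pd.to_numeric(bounds["lower"]).astype("Int64")
df["upper"] = pd.to_numeric(bounds["upper"]).astype("Int64")

print(df)
```

Using the nullable Int64 dtype is a design choice here: it avoids the float column (45.0, 50.0) that appears when NaN is mixed with integers, at the cost of showing the missing value as <NA> instead of NaN.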