Pandas to_numeric

Time:10-28

I am trying to use pandas.to_numeric() to convert the values of a column in my DataFrame to integers. The DataFrame is as follows:

  QuestionID  Value
0         Q1  150.0
1         Q2  160.0
2         Q3    NaN
3         Q4  210.0
4         Q5  Hello

How can I use pandas.to_numeric() to convert the values to integers when they include NaN and Hello, while also dropping the rows that cannot be converted?

My expected dataframe is as follows:

  QuestionID  Value
0         Q1    150
1         Q2    160
3         Q4    210

CodePudding user response:

errors='coerce' makes to_numeric return NaN for any non-numeric value; you can then drop those rows with dropna.

df.assign(Value=pd.to_numeric(df.Value, errors='coerce')).dropna()
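To see what each step does, here is a minimal sketch. The DataFrame construction is my assumption about how the question's data is stored (floats, a missing value, and a string in one object column), and I pass subset=["Value"] to dropna so that only that column decides which rows are dropped. Note that the column is still float64 at the end, so an integer cast is still needed afterwards.

```python
import numpy as np
import pandas as pd

# Assumed reconstruction of the question's data (mixed floats, NaN, and a string)
df = pd.DataFrame({
    "QuestionID": ["Q1", "Q2", "Q3", "Q4", "Q5"],
    "Value": [150.0, 160.0, np.nan, 210.0, "Hello"],
})

# errors="coerce" replaces anything that cannot be parsed ("Hello") with NaN
converted = df.assign(Value=pd.to_numeric(df["Value"], errors="coerce"))

# Drop the rows where "Value" is NaN (the original NaN and the coerced one)
result = converted.dropna(subset=["Value"])

print(converted["Value"].isna().tolist())
print(result)
print(result["Value"].dtype)
```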

CodePudding user response:

import pandas as pd

df = pd.DataFrame([["Q1", "150"], ["Q2", "160"], ["Q3", "NaN"],
                   ["Q4", "210"], ["Q5", "Hello"]], columns=["QuestionID", "Value"])
df

  QuestionID  Value
0         Q1    150
1         Q2    160
2         Q3    NaN
3         Q4    210
4         Q5  Hello

Since you'd like to drop all invalid rows, I'd consider using pd.Series.str.isnumeric() to build a boolean mask (this assumes the values are stored as strings, as in the DataFrame above):

df = df[df["Value"].str.isnumeric()].copy()  # Keep rows whose "Value" consists only of digits; copy to avoid SettingWithCopyWarning
df["Value"] = df["Value"].astype(int)  # Cast to integers
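One caveat with this approach, shown as a small sketch: str.isnumeric() is True only for strings made up entirely of numeric characters, so values such as "150.0" or "-5" get filtered out too, whereas pd.to_numeric with errors='coerce' parses them. The sample values below are made up for illustration and are not from the question.

```python
import pandas as pd

# Hypothetical sample values, chosen to show the difference
values = pd.Series(["150", "150.0", "-5", "Hello"])

# True only when every character in the string is numeric
mask = values.str.isnumeric()

# Parses decimals and signs as well; unparseable strings become NaN
parsed = pd.to_numeric(values, errors="coerce")

print(mask.tolist())
print(parsed.notna().tolist())
```

So if the column may hold decimals or negative numbers as strings, the to_numeric route is probably the safer choice.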

Alternatively, building on @Chris's suggestion, you can add the integer type-casting after the df.assign call:

df.assign(Value=pd.to_numeric(df["Value"], errors='coerce')).dropna().astype({"Value": int})