How to delete an outlier from a np.where condition

Time:10-05

I have a dataframe with an outlier, which I spotted in a boxplot. I located the value with np.where, but I don't know how to delete that value and its whole row from the dataframe so that I can get rid of the outlier.

This is the code I used for it so far:

sns.boxplot(x=df_cor_inc['rt'].astype(float))
outlier = np.where(df_cor_inc['rt'].astype(float)>50000)

Any help would be great. Thanks.

CodePudding user response:

No need for np.where, a simple boolean mask will do the trick:

df_cor_inc = df_cor_inc[df_cor_inc['rt'] <= 50000]

Also, why are you casting df_cor_inc['rt'] as float? Is it not already numeric?

If you want to reset the indices of your dataframe, tack on a .reset_index(drop=True).
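Putting those pieces together, here is a minimal sketch. The sample data and the 'trial' column are made up for illustration, and it assumes 'rt' is stored as strings (which would explain the astype(float) in the question); if the column is already numeric, the conversion line can be dropped.

```python
import pandas as pd

# Hypothetical sample data; the real df_cor_inc is not shown in the question.
df_cor_inc = pd.DataFrame(
    {"rt": ["350", "420", "61000", "510"], "trial": [1, 2, 3, 4]}
)

# Convert once so later comparisons do not need astype(float) every time.
df_cor_inc["rt"] = df_cor_inc["rt"].astype(float)

# Keep only the rows at or below the threshold, then renumber the index.
df_cor_inc = df_cor_inc[df_cor_inc["rt"] <= 50000].reset_index(drop=True)

print(df_cor_inc)
```

Without reset_index(drop=True), the remaining rows keep their original index labels, which leaves a gap where the outlier used to be.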

CodePudding user response:

Try this:

df_cor_inc = df_cor_inc[np.where(df_cor_inc['rt'].astype(float)>50000,False,True)]
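If you would rather reuse the outlier variable from the question, something like the following should also work. It is only a sketch: the sample data and the non-default index are made up to show that np.where returns positions, not index labels.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with a non-default index.
df_cor_inc = pd.DataFrame(
    {"rt": ["350", "420", "61000", "510"]}, index=[10, 11, 12, 13]
)

# With a single argument, np.where returns a tuple of position arrays.
outlier = np.where(df_cor_inc["rt"].astype(float) > 50000)

# Translate the positions into index labels, then drop those rows.
df_cor_inc = df_cor_inc.drop(df_cor_inc.index[outlier[0]])

print(df_cor_inc)
```

Going through df_cor_inc.index matters because drop expects labels; passing the raw positions only works when the index happens to be the default 0, 1, 2, ...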