I have a pandas DataFrame like the one below:
dataframe 1 (name: df)
As you can see, each (A, B, C) combination has n values of X and V.
I built an outlier DataFrame like this:
df_outlier = df[(df["V"] > 150)]
Next, I want to remove every (A, B, C) group that appears in df_outlier.
For example, if df_outlier looks like this:
then I want to remove the rows below from the original DataFrame:
First, I tried the code below:
df_filtered = pd.merge(df, df_outlier, indicator=True, how = 'outer').query('_merge=="left_only"').drop(['_merge'],axis=1)
However, this only removes the exact rows that are in df_outlier, not all rows whose (A, B, C) combination appears in df_outlier.
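To make the behaviour concrete, here is a minimal reproduction. The sample values are made up for illustration (the original tables are not shown), so treat this as a sketch rather than the actual data:

```python
import pandas as pd

# Hypothetical sample data: two (A, B, C) groups, one containing outliers.
df = pd.DataFrame({
    "A": [1, 1, 1, 2, 2],
    "B": [1, 1, 1, 2, 2],
    "C": [1, 1, 1, 2, 2],
    "X": [1, 2, 3, 1, 2],
    "V": [100, 200, 120, 90, 110],
})
df_outlier = df[df["V"] > 150]

# With no `on=` argument, merge joins on every shared column
# (A, B, C, X and V), so only the exact outlier row is matched.
df_filtered = (
    pd.merge(df, df_outlier, indicator=True, how="outer")
    .query('_merge == "left_only"')
    .drop(columns="_merge")
)
print(df_filtered)
```

Only the single row with V = 200 is dropped here; the other rows of group (1, 1, 1) are still present, which is the problem described above.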
Sorry for my poor English; I hope the question is still understandable.
CodePudding user response:
Select only the key columns of df_outlier for the merge, so that rows are matched on (A, B, C) rather than on every shared column:
df_filtered = pd.merge(df, df_outlier[['A','B','C']], indicator=True, how = 'outer').query('_merge=="left_only"').drop(['_merge'],axis=1)
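As a hedged sketch of the same idea on made-up data (the values and the helper names `keys`, `outlier_keys` and `mask` are assumptions for illustration), the version below de-duplicates the key columns first and uses a left merge, then shows an equivalent mask-based alternative built on `MultiIndex.isin`:

```python
import pandas as pd

# Hypothetical sample data: group (1, 1, 1) contains two outliers.
df = pd.DataFrame({
    "A": [1, 1, 1, 2, 2],
    "B": [1, 1, 1, 2, 2],
    "C": [1, 1, 1, 2, 2],
    "X": [1, 2, 3, 1, 2],
    "V": [100, 200, 180, 90, 110],
})
df_outlier = df[df["V"] > 150]

keys = ["A", "B", "C"]
# Keep each outlier key once so a group with several outliers
# does not multiply rows in the merge.
outlier_keys = df_outlier[keys].drop_duplicates()

# Anti-join: keep rows of df whose key has no match in outlier_keys.
merged = df.merge(outlier_keys, on=keys, how="left", indicator=True)
df_filtered = merged[merged["_merge"] == "left_only"].drop(columns="_merge")

# Alternative without a merge: build a boolean mask from the key tuples.
mask = pd.MultiIndex.from_frame(df[keys]).isin(
    pd.MultiIndex.from_frame(outlier_keys)
)
df_filtered_alt = df[~mask]

print(df_filtered)
```

Both variants drop the whole (1, 1, 1) group and keep only the rows of group (2, 2, 2). The mask-based form also preserves the original index and row order of `df`, which may be preferable if those matter downstream.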