pandas DataFrame: remove all rows whose (A,B,C) keys appear in another DataFrame

Time:09-02

I have a pandas DataFrame like the one below:

dataframe 1 (name: df)

(image not included)

As you can see, each (A,B,C) group has n X's and V's.

I built an outlier DataFrame as follows:

df_outlier = df[(df["V"] > 150)]

Then I want to remove every (A,B,C) group that appears in df_outlier, that is, all rows of df that share an (A,B,C) combination with any outlier row.

For example, if df_outlier looks like this:

(image not included)

I want to remove these rows from the original DataFrame:

(image not included)

First, I tried the code below:

df_filtered = pd.merge(df, df_outlier, indicator=True, how = 'outer').query('_merge=="left_only"').drop(['_merge'],axis=1)

However, it only removes the rows that are themselves in df_outlier, not all rows whose (A,B,C) appears in df_outlier.
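
The behaviour can be reproduced with a small made-up frame. The values below are hypothetical and are not the data from the screenshots; only the column names A, B, C, X and V come from the question. Without an `on=` argument, `pd.merge` joins on every column the two frames share, so only the exact outlier row is matched:

```python
import pandas as pd

# Hypothetical sample data; column names follow the question.
df = pd.DataFrame({
    "A": [1, 1, 1, 2, 2],
    "B": [1, 1, 1, 2, 2],
    "C": [1, 1, 1, 2, 2],
    "X": [0, 1, 2, 0, 1],
    "V": [10, 200, 30, 40, 50],
})
df_outlier = df[df["V"] > 150]

# No `on=` given: the join uses all shared columns (A, B, C, X, V),
# so only the single outlier row is marked "both" and dropped.
attempt = (
    pd.merge(df, df_outlier, indicator=True, how="outer")
    .query('_merge == "left_only"')
    .drop(columns="_merge")
)

# Two rows of group (1, 1, 1) are still present.
print(attempt)
```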

Sorry for my poor English; I hope the question is not too hard to understand.

CodePudding user response:

Select only the key columns (A, B, C) from df_outlier before merging, so that rows are matched on the group key rather than on every column:

df_filtered = pd.merge(df, df_outlier[['A','B','C']], indicator=True, how = 'outer').query('_merge=="left_only"').drop(['_merge'],axis=1)
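
As a minimal sketch of the same idea on hypothetical data (the values are made up; only the column names come from the question), the key columns can be de-duplicated first and passed explicitly through `on=`. The second variant, a boolean mask built with `MultiIndex.isin`, is an alternative that is not part of the answer above but should give the same rows:

```python
import pandas as pd

# Hypothetical sample data; column names follow the question.
df = pd.DataFrame({
    "A": [1, 1, 1, 2, 2],
    "B": [1, 1, 1, 2, 2],
    "C": [1, 1, 1, 2, 2],
    "X": [0, 1, 2, 0, 1],
    "V": [10, 200, 30, 40, 50],
})
df_outlier = df[df["V"] > 150]

key_cols = ["A", "B", "C"]

# Unique (A, B, C) combinations that contain at least one outlier.
keys = df_outlier[key_cols].drop_duplicates()

# Variant 1: left merge on the key columns, keep rows with no match.
merged = df.merge(keys, on=key_cols, how="left", indicator=True)
df_filtered = merged[merged["_merge"] == "left_only"].drop(columns="_merge")

# Variant 2 (alternative): mask rows whose key tuple is in the outlier keys.
mask = pd.MultiIndex.from_frame(df[key_cols]).isin(pd.MultiIndex.from_frame(keys))
df_filtered_alt = df[~mask]

print(df_filtered)
```

Both variants drop the whole (1, 1, 1) group here, because one of its rows has V > 150, and keep the (2, 2, 2) group untouched.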