Dataframe drop rows where multiple columns have the same value

Time:12-12

My dataframe has the columns A, B, C, label1, label2, label3. I just want to drop the rows where label1 = label2 = label3. The label values can be 0, 1, 2, 3 or NaN. The best solution I've found so far is this:

df = df.drop(df[(df['label1'] == df['label2']) & (df['label1'] == df['label3'])].index)

Is there any other way I could solve this problem since the code above feels wrong?

CodePudding user response:

To work with several columns given as a list, select them first, then compare every selected column against the first one with DataFrame.ne. DataFrame.any(axis=1) then marks the rows where at least one value differs, and keeping only those rows is the same as dropping the rows in which all the values are equal:

print (df)
   A  B  C  label1  label2  label3
0  4  5  6       0       0       0
1  1  2  3       7       4       5

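# select only the label columns
df1 = df[['label1','label2', 'label3']]

# keep rows where at least one label differs from the first label column
df = df[df1.ne(df1.iloc[:, 0], axis=0).any(axis=1)]
print (df)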
   A  B  C  label1  label2  label3
1  1  2  3       7       4       5
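
Since the question mentions NaN labels, here is a minimal sketch of how the comparison above treats them. The helper name drop_rows_with_equal_labels, its nan_equal flag and the sample data are made up for illustration and are not part of the original answer. With ne, NaN never equals anything (not even another NaN), so a row whose labels are all NaN is kept. If such rows should be dropped as well, one possible alternative is to count distinct values per row with DataFrame.nunique(axis=1, dropna=False):

```python
import numpy as np
import pandas as pd


def drop_rows_with_equal_labels(df, cols, nan_equal=False):
    """Return df without the rows whose values in `cols` are all identical.

    Hypothetical helper for illustration only.
    """
    sub = df[cols]
    if nan_equal:
        # dropna=False counts NaN as a value, so an all-NaN row
        # has exactly one distinct value and is dropped
        keep = sub.nunique(axis=1, dropna=False) > 1
    else:
        # NaN != NaN evaluates to True, so rows containing NaN are kept
        keep = sub.ne(sub.iloc[:, 0], axis=0).any(axis=1)
    return df[keep]


# sample data (assumed): row 2 has only NaN labels, row 3 has one NaN
df = pd.DataFrame({
    "A": [4, 1, 7, 2],
    "B": [5, 2, 8, 3],
    "C": [6, 3, 9, 4],
    "label1": [0, 7, np.nan, 1],
    "label2": [0, 4, np.nan, 1],
    "label3": [0, 5, np.nan, np.nan],
})
labels = ["label1", "label2", "label3"]

default_result = drop_rows_with_equal_labels(df, labels)
nan_equal_result = drop_rows_with_equal_labels(df, labels, nan_equal=True)

print(default_result.index.tolist())
print(nan_equal_result.index.tolist())
```

The default branch matches the ne/any approach from the answer, while nan_equal=True additionally removes the row in which all three labels are NaN. Which behaviour is right depends on whether missing labels should count as "the same value" in your data.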