How to drop rows (or subset other rows) based on values in lists in pandas? Create mutually exclusiv

Time:12-26

How can I drop rows that contain at least one element from each of the two lists? I am looking for an approach that scales to more than 100 columns. A minimal example with 3 columns:

import pandas as pd

list1 = ["abc1", "def"]
list2 = ["ghi", "ghj"]
df = pd.DataFrame({"index": [0, 1, 2, 3, 4, 5, 6, 7, 8],
                   "col1": ["abc1", "ghj", "ghi", "abc1", "", "def", "ghj", "abc1", "abc1"],
                   "col2": ["abc1", "abc1", "dfg", "dfg", "ghi", "dfg", "", "ghj", "abc1"],
                   "col3": ["abc1", "qrst", "dfg", "dfg", "dfg", "dfg", "abc1", "ghi", "abc1"]})
   index  col1  col2  col3
0      0  abc1  abc1  abc1
1      1   ghj  abc1  qrst
2      2   ghi   dfg   dfg
3      3  abc1   dfg   dfg
4      4         ghi   dfg
5      5   def   dfg   dfg
6      6   ghj        abc1
7      7  abc1   ghj   ghi
8      8  abc1  abc1  abc1

Rows 1, 6, and 7 must be dropped because each of them contains elements from both lists. finaldf should be:

   index  col1  col2  col3
0      0  abc1  abc1  abc1
1      2   ghi   dfg   dfg
2      3  abc1   dfg   dfg
3      4         ghi   dfg
4      5   def   dfg   dfg
5      8  abc1  abc1  abc1

CodePudding user response:

finaldf = df[ ~(df.isin(list1).any(axis=1) & df.isin(list2).any(axis=1)) ].reset_index(drop=True)

output:

   index  col1  col2  col3
0      0  abc1  abc1  abc1
1      2   ghi   dfg   dfg
2      3  abc1   dfg   dfg
3      4         ghi   dfg
4      5   def   dfg   dfg
5      8  abc1  abc1  abc1
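
For readability, the one-liner can be unpacked into named steps. The following is only a sketch of the same isin/any idea, not a different method. The names value_cols, has_list1, has_list2 and dropped are introduced here for illustration, and restricting the search to columns whose name starts with "col" is an assumption (the original applies isin to the whole frame, including the "index" column).

```python
import pandas as pd

list1 = ["abc1", "def"]
list2 = ["ghi", "ghj"]
df = pd.DataFrame({
    "index": [0, 1, 2, 3, 4, 5, 6, 7, 8],
    "col1": ["abc1", "ghj", "ghi", "abc1", "", "def", "ghj", "abc1", "abc1"],
    "col2": ["abc1", "abc1", "dfg", "dfg", "ghi", "dfg", "", "ghj", "abc1"],
    "col3": ["abc1", "qrst", "dfg", "dfg", "dfg", "dfg", "abc1", "ghi", "abc1"],
})

# Assumption: only the "col*" columns are searched, not the "index" column.
value_cols = [c for c in df.columns if c.startswith("col")]

# One boolean per row: does any searched column hold a value from the list?
has_list1 = df[value_cols].isin(list1).any(axis=1)
has_list2 = df[value_cols].isin(list2).any(axis=1)

# Keep rows that do NOT have hits from both lists at the same time.
finaldf = df[~(has_list1 & has_list2)].reset_index(drop=True)

# The "index" values of the rows that were removed, for checking.
dropped = df.loc[has_list1 & has_list2, "index"].tolist()
print(dropped)  # [1, 6, 7]
```

Because isin and any operate on all selected columns at once, the same code should work unchanged with 100 or more columns; only the value_cols selection would need adjusting.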