How can I drop rows that contain at least one element from each of the two lists? I am looking for something that scales to more than 100 columns without naming each column. A minimal example with 3 columns:
import pandas as pd

list1 = ["abc1", "def"]
list2 = ["ghi", "ghj"]
df = pd.DataFrame({"index": [0, 1, 2, 3, 4, 5, 6, 7, 8],
                   "col1": ["abc1", "ghj", "ghi", "abc1", "", "def", "ghj", "abc1", "abc1"],
                   "col2": ["abc1", "abc1", "dfg", "dfg", "ghi", "dfg", "", "ghj", "abc1"],
                   "col3": ["abc1", "qrst", "dfg", "dfg", "dfg", "dfg", "abc1", "ghi", "abc1"]})
   index  col1  col2  col3
0      0  abc1  abc1  abc1
1      1   ghj  abc1  qrst
2      2   ghi   dfg   dfg
3      3  abc1   dfg   dfg
4      4         ghi   dfg
5      5   def   dfg   dfg
6      6   ghj        abc1
7      7  abc1   ghj   ghi
8      8  abc1  abc1  abc1
Rows 1, 6, and 7 must be dropped because each contains elements from both lists. finaldf should be:
   index  col1  col2  col3
0      0  abc1  abc1  abc1
1      2   ghi   dfg   dfg
2      3  abc1   dfg   dfg
3      4         ghi   dfg
4      5   def   dfg   dfg
5      8  abc1  abc1  abc1
CodePudding user response:
finaldf = df[ ~(df.isin(list1).any(axis=1) & df.isin(list2).any(axis=1)) ].reset_index(drop=True)
Output:
   index  col1  col2  col3
0      0  abc1  abc1  abc1
1      2   ghi   dfg   dfg
2      3  abc1   dfg   dfg
3      4         ghi   dfg
4      5   def   dfg   dfg
5      8  abc1  abc1  abc1
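For readers who want to see the one-liner broken into steps, here is a minimal sketch of the same idea. It builds one boolean mask per list and drops the rows where both masks are True. The names value_cols, has_list1, and has_list2 are illustrative and not from the original answer. Restricting the search to the non-"index" columns is an assumption; the original answer searches every column, which gives the same result here because "index" holds only integers.

```python
import pandas as pd

list1 = ["abc1", "def"]
list2 = ["ghi", "ghj"]

df = pd.DataFrame({
    "index": [0, 1, 2, 3, 4, 5, 6, 7, 8],
    "col1": ["abc1", "ghj", "ghi", "abc1", "", "def", "ghj", "abc1", "abc1"],
    "col2": ["abc1", "abc1", "dfg", "dfg", "ghi", "dfg", "", "ghj", "abc1"],
    "col3": ["abc1", "qrst", "dfg", "dfg", "dfg", "dfg", "abc1", "ghi", "abc1"],
})

# Assumption: only the value columns are searched, not the "index" column.
# This works unchanged for 100+ columns because no column is named explicitly.
value_cols = df.columns.drop("index")

# True for each row that contains at least one element of the given list.
has_list1 = df[value_cols].isin(list1).any(axis=1)
has_list2 = df[value_cols].isin(list2).any(axis=1)

# Keep rows that do NOT contain elements from both lists at once.
finaldf = df[~(has_list1 & has_list2)].reset_index(drop=True)
print(finaldf)
```

Keeping the two masks as separate variables makes it easy to inspect which rows match each list before combining them.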