Home > Software engineering >  Drop rows if a list of values appear in both columns
Drop rows if a list of values appear in both columns

Time:10-14

I have the following df:

from to value 
A    B   2
Z    B   5
C    A   7
A    C   7
D    E   7
G    H   5
H    E   4
F    C   2
A    P   4
..

and a list of ids

lst = ['A', 'C', 'D', 'F']

I want to drop rows for which both from and to contain an element from the list.

Thus, the desired df:

from to value 
A    B   2
Z    B   5
D    E   7
G    H   5
H    E   4
A    P   4
..

CodePudding user response:

if you like sql:

df = pd.DataFrame([
    ['A','B',2],
    ['Z','B',5],
    ['C','A',7],
    ['A','C',7],
    ['D','E',7],
    ['G','H',5],
    ['H','E',4],
    ['G','C',2],
    ['A','P',4]
], columns=['From', 'To', 'value'])

lst = ['A', 'C', 'D', 'F']

df.query('not (From in @lst and To in @lst)')

    From    To  value
0   A   B   2
1   Z   B   5
4   D   E   7
5   G   H   5
6   H   E   4
7   G   C   2
8   A   P   4

(note capitalization of 'From' and 'To' to avoid keyword conflict)

CodePudding user response:

Use isin combined with all for boolean indexing:

lst = ['A', 'C', 'D', 'F']

out = df[~df[['from', 'to']].isin(lst).all(axis=1)]

Or combine two masks:

out = df[~(df['from'].isin(lst) & df['to'].isin(lst))]

output:

  from to  value
0    A  B      2
1    Z  B      5
4    D  E      7
5    G  H      5
6    H  E      4
8    A  P      4
  • Related