Home > Back-end >  Subset df in pandas where value is in multiple columns
Subset df in pandas where value is in multiple columns

Time:10-15

I have a df like so:

             A  age  B   C   D   E
0            0   58  0   1   0   0   
1            0   45  0   0  99   0   
2            1   41  0   1   1   0   
3            0   53  0   1  99  99   
4            0   28  0  99   0  99   

What I wish to do is get the subset of the df where columns B, C, D, or E equal 99, like so:

             A  age  B   C   D   E   
1            0   45  0   0  99   0      
3            0   53  0   1  99  99   
4            0   28  0  99   0  99 

Tried this:

df99 = df[df['B'] == 99 or df['C'] == 99 or df['D'] == 99 or df['E'] == 99]

but that returned this error:

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

CodePudding user response:

Followed advice from this article and did this:

df99 = df[(df['B'] == 99) | (df['C'] == 99) | (df['D'] == 99) | (df['E'] == 99)]

Replacing or with | and wrapping each argument within () solved the problem.

CodePudding user response:

here is one way to do it

# choose columns B-E using loc
# check if value is 99 in any of the rows

df[df.loc[:,"B":"E"].eq(99).any(axis=1)]
A   age     B   C   D   E
1   0   45  0   0   99  0
3   0   53  0   1   99  99
4   0   28  0   99  0   99
  • Related