I have a df like so:
A age B C D E
0 0 58 0 1 0 0
1 0 45 0 0 99 0
2 1 41 0 1 1 0
3 0 53 0 1 99 99
4 0 28 0 99 0 99
What I wish to do is get the subset of the df where columns B, C, D, or E equal 99, like so:
A age B C D E
1 0 45 0 0 99 0
3 0 53 0 1 99 99
4 0 28 0 99 0 99
Tried this:
df99 = df[df['B'] == 99 or df['C'] == 99 or df['D'] == 99 or df['E'] == 99]
but that returned this error:
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
CodePudding user response:
Followed advice from this article and did this:
df99 = df[(df['B'] == 99) | (df['C'] == 99) | (df['D'] == 99) | (df['E'] == 99)]
Replacing or with | and wrapping each argument within () solved the problem.
CodePudding user response:
here is one way to do it
# choose columns B-E using loc
# check if value is 99 in any of the rows
df[df.loc[:,"B":"E"].eq(99).any(axis=1)]
A age B C D E
1 0 45 0 0 99 0
3 0 53 0 1 99 99
4 0 28 0 99 0 99