I have a dataframe x
, have three columns, A, B and C
import pandas as pd
X = pd.DataFrame()
X['A'] = [-100, 0, 2, 3]
X['B'] = [0, -100, 1, 2]
X['C'] = [0, 0, 0, 1]
X
A B C
0 -100 0 0
1 0 -100 0
2 2 1 0
3 3 2 1
I would like to locate the rows which satisfy the following condition,
(If either column A or column B value equals to -100) OR (the column C value equals to 1)
In terms of the above example, lines [0,1,3] are what I want to get
I tried this one,
x1 = ( X[['A','B']]== -100 ) or (X['C'] == 1)
but get the following error message, what should be the right approach?
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_23048/1363478251.py in <module>
----> 1 x1 = ( X[['A','B']]== -100 ) or (X['C'] == 1)
~\Anaconda3\envs\pyro\lib\site-packages\pandas\core\generic.py in __nonzero__(self)
1327
1328 def __nonzero__(self):
-> 1329 raise ValueError(
1330 f"The truth value of a {type(self).__name__} is ambiguous. "
1331 "Use a.empty, a.bool(), a.item(), a.any() or a.all()."
ValueError: The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
CodePudding user response:
There are 2 errors.
you cannot use
or
with series, you should use the bitwise or:|
you need to aggregate the comparisons on the left hand part using
any
NB. I used A.eq(B)
for clarity, but this is equivalent to A == B
x1 = X[['A','B']].eq(-100).any(axis=1) | X['C'].eq(1)
output:
0 True
1 True
2 False
3 True
dtype: bool
slicing the rows
to subselect the matching rows:
X[x1]
output:
A B C
0 -100 0 0
1 0 -100 0
3 3 2 1