Home > database >  select rows from dataframe based on composite logic conditions
select rows from dataframe based on composite logic conditions

Time:11-17

I have a dataframe x, have three columns, A, B and C

import pandas as pd
X = pd.DataFrame()
X['A'] = [-100, 0, 2, 3]
X['B'] = [0, -100, 1, 2]
X['C'] = [0, 0, 0, 1]
X

      A     B   C
0   -100    0   0
1    0   -100   0
2    2     1    0
3    3     2    1

I would like to locate the rows which satisfy the following condition,

(If either column A or column B value equals to -100)   OR  (the column C value equals to 1)

In terms of the above example, lines [0,1,3] are what I want to get

I tried this one,

x1 = ( X[['A','B']]== -100 ) or (X['C'] == 1)

but get the following error message, what should be the right approach?

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_23048/1363478251.py in <module>
----> 1 x1 = ( X[['A','B']]== -100 ) or (X['C'] == 1)

~\Anaconda3\envs\pyro\lib\site-packages\pandas\core\generic.py in __nonzero__(self)
   1327 
   1328     def __nonzero__(self):
-> 1329         raise ValueError(
   1330             f"The truth value of a {type(self).__name__} is ambiguous. "
   1331             "Use a.empty, a.bool(), a.item(), a.any() or a.all()."

ValueError: The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

CodePudding user response:

There are 2 errors.

  1. you cannot use or with series, you should use the bitwise or: |

  2. you need to aggregate the comparisons on the left hand part using any

NB. I used A.eq(B) for clarity, but this is equivalent to A == B

x1 = X[['A','B']].eq(-100).any(axis=1) | X['C'].eq(1)

output:

0     True
1     True
2    False
3     True
dtype: bool
slicing the rows

to subselect the matching rows:

X[x1]

output:

     A    B  C
0 -100    0  0
1    0 -100  0
3    3    2  1
  • Related