pandas dataframe and/or condition syntax

Time:09-30

This pandas DataFrame condition works perfectly:

    df2 = df1[(df1.A >= 1) | (df1.C >= 1) ]

But I want to filter rows based on 2 conditions, keeping a row if either one holds:

(1) A >= 1 and B == 10

(2) C >= 1


    df2 = df1[(df1.A >= 1 & df1.B=10) | (df1.C >= 1) ]

This gives me an error message:

[ERROR] Cannot perform 'rand_' with a dtyped [object] array and scalar of type [bool]

Can anyone help? Thank you!

CodePudding user response:

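One set of brackets is missing. Add brackets around the A and B conditions individually as well, because & binds more tightly than >= and ==. Also note that the equality test is ==, not =.

Try this:

    df2 = df1[((df1.A >= 1) & (df1.B == 10)) | (df1.C >= 1)]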

Example:

    import pandas as pd

    df1 = pd.DataFrame({'A': [0,0,1,1,2,2], 'B': [0,10,0,10,0,10], 'C': [2,2,3,3,0,0]})
    df1

        A   B   C
    0   0   0   2
    1   0   10  2
    2   1   0   3
    3   1   10  3
    4   2   0   0
    5   2   10  0


    df2 = df1[((df1.A >= 1) & (df1.B == 10)) | (df1.C >= 1)]
    df2

        A   B   C
    0   0   0   2
    1   0   10  2
    2   1   0   3
    3   1   10  3
    5   2   10  0
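
If the nested brackets get hard to read, one option is to give each comparison its own name before combining them. This is only a sketch on the same sample data; the names a_ok, b_ok, c_ok and kept_rows are made up for illustration.

```python
import pandas as pd

# Same sample data as in the example above.
df1 = pd.DataFrame({'A': [0, 0, 1, 1, 2, 2],
                    'B': [0, 10, 0, 10, 0, 10],
                    'C': [2, 2, 3, 3, 0, 0]})

# Each comparison is evaluated on its own line, so & and | only ever
# combine boolean Series and precedence cannot reorder anything.
a_ok = df1.A >= 1     # hypothetical name
b_ok = df1.B == 10    # hypothetical name
c_ok = df1.C >= 1     # hypothetical name

df2 = df1[(a_ok & b_ok) | c_ok]
kept_rows = df2.index.tolist()
print(kept_rows)  # [0, 1, 2, 3, 5]
```

Row 4 is the only one dropped: it has A >= 1 but B is 0 and C is 0, so neither condition holds.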

CodePudding user response:

Putting brackets around each condition fixes this, and the equality test needs == rather than =. So, instead of

    df2 = df1[(df1.A >= 1 & df1.B=10) | (df1.C >= 1) ]

you would do

    df2 = df1[((df1.A >= 1) & (df1.B == 10)) | (df1.C >= 1)]
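
Why the brackets matter can be seen in a small sketch. In Python, & has higher precedence than >= and ==, so without brackets the 1 is paired with df1.B first. The integer columns below are an assumption taken from the sample data in the first answer; with an object-dtype B column the same misparse would more likely surface as the 'rand_' TypeError quoted in the question.

```python
import pandas as pd

# With plain integers the precedence is easy to see:
# & binds tighter than >= and ==, so the first line is read as
# 2 >= (1 & 10) == 10, i.e. (2 >= 0) and (0 == 10).
unparenthesized = 2 >= 1 & 10 == 10      # False
parenthesized = (2 >= 1) & (10 == 10)    # True

df1 = pd.DataFrame({'A': [0, 0, 1, 1, 2, 2],
                    'B': [0, 10, 0, 10, 0, 10],
                    'C': [2, 2, 3, 3, 0, 0]})

# The same expression on integer columns is read as
# df1.A >= (1 & df1.B) == 10, a chained comparison that needs a single
# True/False from a whole Series, so pandas raises ValueError.
try:
    df1[(df1.A >= 1 & df1.B == 10) | (df1.C >= 1)]
    error_name = None
except ValueError as exc:
    error_name = type(exc).__name__

# Bracketing each comparison gives & two boolean Series to combine.
mask = ((df1.A >= 1) & (df1.B == 10)) | (df1.C >= 1)
print(unparenthesized, parenthesized, error_name)
print(mask.tolist())
```

The two scalar lines differ only in their brackets yet give opposite answers, which is the same thing that goes wrong inside the DataFrame filter.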
