Home > Software design >  Finding rows which match both conditions
Finding rows which match both conditions

Time:05-30

Say I have a dataframe:

Do Re Mi Fa So
1 0 Foo 100 50
1 1 Bar 75 20
0 0 True 59 59
1 1 False 0 12

How would I go about finding all rows where the value in columns "Do" and "Re" BOTH equal 1 AND "Fa" is higher than "So"?

I've tried a couple of ways, but they first returns an error complaining of ambiguity: df['both_1']=((df['Do'] > df['Re']) & (df['Mi'] == df['Fa'] == 1))

I also tried breaking it down into steps, but I realised the last step will result in me bringing in both True and False statements. I only want True.

    
    df['Do_1'] = df['Do'] == 1
    
    df['Re_1'] = df['Re'] == 1
    
    # This is where I realised I'm bringing in the False rows too
    df['both_1'] = (df['Do1'] == df['Re_1'])```

CodePudding user response:

Chain mask by another & for bitwise AND:

df['both_1']= (df['Fa'] > df['So']) & (df['Do'] == 1) & (df['Re'] == 1)

print (df)
   Do  Re     Mi   Fa  So  both_1
0   1   0    Foo  100  50   False
1   1   1    Bar   75  20    True
2   0   0   True   59  59   False
3   1   1  False    0  12   False

Or if possible multiple columns in list comapre filtered columns by subset df[['Do', 'Re']] and test if all Trues by DataFrame.all:

df['both_1']= (df['Fa'] > df['So']) & (df[['Do', 'Re']] == 1).all(axis=1)

If need filter use boolean indexing:

df1 = df[(df['Fa'] > df['So']) & (df['Do'] == 1) & (df['Re'] == 1)]

For second solution:

df1 = df[(df['Fa'] > df['So']) & (df[['Do', 'Re']] == 1).all(axis=1)]

CodePudding user response:

You can combine the three conditions using & (bitwise and):

df[(df['Do'] == 1) & (df['Re'] == 1) & (df['Fa'] > df['So'])]

Output (for your sample data):

   Do  Re    Mi  Fa  So
1   1   1  Bar   75  20

CodePudding user response:

Try:

import numpy as np
import pandas as pd

df = pd.DataFrame({'Do':[1,1,0,1],'Re':[0,1,0,1],'Mi':['Foo','Bar',True,False],'Fa':[100,75,59,0],'Sol':[50,20,59,12]})

df.loc[np.where((df.Do == 1) & (df.Re ==1) & (df.Fa > df.Sol))[0]]

Output:
   Do  Re   Mi  Fa  Sol
    1   1   1  Bar  75   20

If you do not want to import anything else, just do this:

df.assign(keep=[1 if (df.loc[i,'Do'] == 1 and df.loc[i,'Re'] == 1 and df.loc[i,'Fa'] > df.loc[i,'Sol']) else 0 for i in df.index]).query("keep == 1").drop('keep',axis=1)
  • Related