Home > Blockchain >  How to filter a pandas dataframe using a variable filter
How to filter a pandas dataframe using a variable filter

Time:10-06

I have a dataframe that contains multiple columns:

print(usersData):

    First name     Last name     ID1       ID2        Description         ...
0       Thomas        Shelby   TS123     T_SHE               Yeah         ...
1         Jack          Down   JD124     J_DOW               Yeah         ...
2         John           Doe   JD125     J_DOE               Yeah         ...
3        Alice          Lang   AL127     A_LAN               Yeah         ...

And I want to filter this filter based on variable criterias. I need to be able to filter using a combination of attributes that need to match a specific request. For example, if should be able to filter by First name AND Last name, like so:

firstname = "Thomas"
lastname = "Shelby"
searchResult = usersData[usersData['First name'] == firstname & usersData['Lastname'] == lastname]

But I want it to be dynamic, meaning I want to be able to do something like:

criterias = {"First name": "Thomas", "Last name": "Shelby"}
searchResult = usersData[for criteria in criterias: (usersData[criteria] == criterias[criteria] &)]

Anyone has an idea of the best way to do that?

CodePudding user response:

You can use boolean indexing with pandas.DataFrame.iloc and pandas.Series.isin combined.

criterias = {"First name": "Thomas", "Last name": "Shelby"}

usersData.loc[usersData[criterias.keys()].isin(criterias.values()).all(axis = 1),: ]

# Output :

 First name Last name    ID1    ID2 Description
0     Thomas    Shelby  TS123  T_SHE        Yeah

CodePudding user response:

eval() can be used:

import pandas as pd
df=pd.DataFrame(data={'first_name':['Thomas','Jack','John','Alice'],'last_name':['Shelby','Down','Doe','Lang'],
                      'ID1':['TS123','JD124','JD125','AL127']})

criterias = {"first_name": "Thomas", "last_name": "Shelby","ID1":"TS123"}

con=[]
for key, value in criterias.items():
    con.append("(df['{}']=='{}')".format(key,value))
    
condition=' & '.join(con)

print(condition)
'''
"(df['first_name']=='Thomas') & (df['last_name']=='Shelby') & (df['ID1']=='TS123')"
'''
final=df[eval(condition)]
print(final)

'''
      first_name last_name    ID1
0     Thomas     Shelby       TS123
'''

Can be developed according to the use case.

CodePudding user response:

We can apply the filter conditions in a loop.

from copy import deepcopy
criteria = {"First name": "Thomas", "Last name": "Shelby"}
searchResult = deepcopy(usersData)
for c in criteria:
    searchResult = searchResult[searchResult[c] == criteria[c]]

  • Related