How to use a loop to check multiple conditions on multiple columns to filter a dataframe in Python

Time:08-02

I have a list containing names of the columns of a dataframe.

x = ['USPR', 'ESA', 'OSFI', 'APRA']

The values for these columns are either 'Yes' or 'No'.

I want to keep only the rows where any of these columns is 'Yes'. I'd like to use something like a for loop to iterate through the list, because the list is created from user input. So instead of a static check like the one below:

df = df[(df['USPR'] == 'Yes') | (df['ESA'] == 'Yes') | (df['OSFI'] == 'Yes') | (df['APRA'] == 'Yes')]

I'm wondering how to make this dynamic using a loop, i.e. the number of conditions checked would equal the length of x. Any other suggestion to achieve the same outcome would also be appreciated.

Many thanks.

For the below sample dataframe:

[image: sample dataframe]

I'm supposed to get the filtered dataframe as below:

[image: expected filtered dataframe]

CodePudding user response:

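Something like this checks each column in x for at least one 'Yes':

for i in x:
    print(f"checking.. {i}")
    # The values are 'Yes'/'No' (capitalised), so compare against 'Yes'
    if (df[i] == 'Yes').any():
        print(f"true for {i}")
    else:
        print(f"false for {i}")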

The output:

checking.. USPR
true for USPR
checking.. ESA
true for ESA
checking.. OSFI
false for OSFI
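The loop above only reports which columns contain a 'Yes'. To get the row filter the question asks for, one option is to have the same loop accumulate a boolean mask, one condition per column in x. This is a minimal sketch; the sample data is hypothetical, since the original dataframe was only posted as an image.

```python
import pandas as pd

# Hypothetical sample data (the real dataframe is not shown in the thread)
df = pd.DataFrame({
    'USPR': ['Yes', 'No', 'No', 'No'],
    'ESA':  ['No', 'No', 'Yes', 'No'],
    'OSFI': ['No', 'No', 'No', 'No'],
    'APRA': ['No', 'Yes', 'No', 'No'],
})
x = ['USPR', 'ESA', 'OSFI', 'APRA']  # column names, e.g. from user input

# Start with an all-False mask and OR in one condition per column
mask = pd.Series(False, index=df.index)
for col in x:
    mask |= df[col] == 'Yes'

filtered = df[mask]  # rows where at least one column in x is 'Yes'

# Loop-free equivalent: compare all columns at once, then reduce across each row
filtered_vec = df[df[x].eq('Yes').any(axis=1)]
```

Both forms adapt to whatever length x has, so no condition needs to be hard-coded.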

To remove the columns that contain no 'Yes':

for col in x:
    print(f"checking.. {col}")
    # Drop the column when none of its values is 'Yes'
    if not (df[col] == 'Yes').any():
        df.drop(col, axis=1, inplace=True)
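The column-dropping step can probably also be written without an explicit loop or in-place mutation. The sketch below assumes the same 'Yes'/'No' values and uses hypothetical sample data; names such as has_yes and result are illustrative only.

```python
import pandas as pd

# Hypothetical sample data (the real dataframe is not shown in the thread)
df = pd.DataFrame({
    'USPR': ['Yes', 'No', 'No'],
    'ESA':  ['No', 'No', 'Yes'],
    'OSFI': ['No', 'No', 'No'],
    'APRA': ['No', 'Yes', 'No'],
})
x = ['USPR', 'ESA', 'OSFI', 'APRA']

# One boolean per column: does the column contain at least one 'Yes'?
has_yes = df[x].eq('Yes').any()

# Names of the columns in x that never contain 'Yes'
to_drop = has_yes[~has_yes].index

# Return a new dataframe instead of modifying df in place
result = df.drop(columns=to_drop)
```

Returning a new dataframe leaves df untouched, which tends to be easier to reason about when x comes from user input.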

CodePudding user response:

I'm not sure I understand what you want, but here is my suggestion:

import pandas as pd
df = pd.DataFrame(['USPR', 'ESA', 'OSFI', 'APRA'], columns=['country'])
df

Output

country
0 USPR
1 ESA
2 OSFI
3 APRA

Create a function that holds as many if branches as you need:

def foo(x):
    if x == 'USPR':
        return 'Yes'
    elif x == 'ESA':
        return 'Yes'
    elif x == 'OSFI':
        return 'Yes'
    elif x == 'APRA':
        return 'Yes'
df['new_col'] = df.country.apply(foo)

Output

country new_col
0 USPR Yes
1 ESA Yes
2 OSFI Yes
3 APRA Yes