I have a list containing names of the columns of a dataframe.
x = ['USPR', 'ESA', 'OSFI', 'APRA']
The values for these columns are either 'Yes' or 'No'.
I want to keep only the rows where any of these columns is 'Yes'. I'd like to use something like a for loop to iterate through the list, because the list is created from user input. So instead of having a static check like the one below:
df = df[(df['USPR'] == 'Yes') | (df['ESA'] == 'Yes') | (df['OSFI'] == 'Yes') | (df['APRA'] == 'Yes')]
I'm wondering how to make this dynamic using a loop, i.e. the number of conditions checked would be equal to the length of x. Any other suggestion to achieve the outcome would also be appreciated.
Much thanks.
For the below sample dataframe:
I'm supposed to get the filtered dataframe as below:
CodePudding user response:
Something like this:
for i in x:
    print(f"checking.. {i}")
    # The comparison is case-sensitive; the question's values are 'Yes'/'No'
    if len(df[df[i] == 'Yes']) > 0:
        print(f"true for {i}")
    else:
        print(f"false for {i}")
The output:
checking.. USPR
true for USPR
checking.. ESA
true for ESA
checking.. OSFI
false for OSFI
To remove the columns that contain no 'Yes':
for col in x:
    print(f"checking.. {col}")
    # Drop the column when none of its values is 'Yes'
    if not (df[col] == 'Yes').any():
        df.drop(col, axis=1, inplace=True)
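For the row filter the question actually describes (keep rows where any of the listed columns is 'Yes'), a minimal sketch might look like the one below. The sample dataframe and its `name` column are made up for illustration, since the original data isn't shown. It builds the mask once in a vectorized way and once with a loop over `x`, which should give the same result.

```python
import pandas as pd

# Hypothetical sample data; the question's real dataframe is not shown.
df = pd.DataFrame({
    "name": ["a", "b", "c", "d"],
    "USPR": ["Yes", "No", "No", "No"],
    "ESA":  ["No", "No", "Yes", "No"],
    "OSFI": ["No", "No", "No", "No"],
    "APRA": ["No", "No", "Yes", "No"],
})
x = ["USPR", "ESA", "OSFI", "APRA"]  # column names, e.g. built from user input

# Vectorized: True for rows where at least one listed column equals 'Yes'.
mask = df[x].eq("Yes").any(axis=1)
filtered = df[mask]

# Loop version: OR the per-column conditions together, one column at a time.
loop_mask = pd.Series(False, index=df.index)
for col in x:
    loop_mask |= df[col] == "Yes"
filtered_loop = df[loop_mask]

print(filtered)
```

Either way the number of conditions follows the length of `x`, so nothing has to be hard-coded.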
CodePudding user response:
I'm not sure I understand what you want, but here is my suggestion anyway:
import pandas as pd
df = pd.DataFrame(['USPR', 'ESA', 'OSFI', 'APRA'], columns=['country'])
df
Output
| | country |
|---|---|
| 0 | USPR |
| 1 | ESA |
| 2 | OSFI |
| 3 | APRA |
Create a function that holds as many if branches as you need:
def foo(x):
    if x == 'USPR':
        return 'Yes'
    elif x == 'ESA':
        return 'Yes'
    elif x == 'OSFI':
        return 'Yes'
    elif x == 'APRA':
        return 'Yes'

df['new_col'] = df.country.apply(foo)
Output
| | country | new_col |
|---|---|---|
| 0 | USPR | Yes |
| 1 | ESA | Yes |
| 2 | OSFI | Yes |
| 3 | APRA | Yes |
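If the set of names comes from user input, the chain of ifs could probably be replaced by a lookup built from the list. This is only a sketch under that assumption: the `lookup` dict and the extra 'OTHER' row are made up for illustration, and values not in the list come out as NaN here (the function above returns None for them).

```python
import pandas as pd

x = ["USPR", "ESA", "OSFI", "APRA"]  # names to flag, e.g. from user input

# 'OTHER' is a hypothetical extra value, added to show the unmatched case.
df = pd.DataFrame({"country": x + ["OTHER"]})

# Build the mapping from the list instead of writing one branch per name.
lookup = dict.fromkeys(x, "Yes")
df["new_col"] = df["country"].map(lookup)  # names missing from lookup become NaN

print(df)
```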