I have an excel table that use as reference for logical operators so I can join them later to apply a logical string to pandas dataframe.
dataframe
GOOD BAD UGLY
0 101 60 0
1 22 61 0
2 103 62 NaN
3 104 63 0
I can get values from the excel sheet and append them into list. But How can parse this logical formulas to df
?
import pandas as pd
import openpyxl
def create_dataframe():
df = pd.DataFrame({'GOOD': [101,22,103,104],
'BAD': [60,61,62,63],
'UGLY': [0,0,'NaN',0],
})
print(df)
read_filter = pd.read_excel('test.xlsx')
print(read_filter)
formulas = []
logicals = ['>','<']
for i, filter_col in enumerate(read_filter['col1']):
if read_filter['Logic'][i] in logicals:
formula = f"df['{filter_col}'][{i}]" read_filter['Logic'][i] str(read_filter['value'][i])
formulas.append(formula)
else:
formula = f"{read_filter['Logic'][i]}(df['{filter_col}'])"
formulas.append(formula)
#
print(formulas)
#df['Result'] = df.apply(lambda x: eval(formulas) , axis=1)
return df
formulas ----
["df['GOOD'][0]>100", "df['BAD'][1]<50", "pd.isna(df['UGLY'])"]
The expected result :
GOOD BAD UGLY Result
0 101 60 0 False
1 22 61 0 False
2 103 62 True
3 104 63 0 False
CodePudding user response:
You can create the full condition like this:
>>> ' & '.join(f"({f})" for f in formulas)
"(df['GOOD'][0]>100) & (df['BAD'][1]>50) & (pd.isna(df['UGLY']))"
Each expression should be put in parentheses. Otherwise a > b & c > d
will be parsed as a > (b & c) > d
, not (a > b) & (c > d)
.
Then eval
it:
>>> import pandas as pd
>>> df = pd.DataFrame({'GOOD': [101,22,103,104], 'BAD': [60,61,62,63], 'UGLY': [0,0,float('nan'),0]})
>>> formulas = ["df['GOOD'][0]>100", "df['BAD'][1]<50", "pd.isna(df['UGLY'])"]
>>> eval(' & '.join(f"({f})" for f in formulas), {'df': df, 'pd': pd})
0 False
1 False
2 True
3 False
Name: UGLY, dtype: bool
Then you can create a column with this result:
>>> df.assign(Result=eval(' & '.join(f"({f})" for f in formulas), {'df': df, 'pd': pd}))
GOOD BAD UGLY Result
0 101 60 0.0 False
1 22 61 0.0 False
2 103 62 NaN True
3 104 63 0.0 False