I have a function/method written inside a class:
def minimal_percentage(self, input_col=None, perc=5, conditions=None, **kwargs):
    print("name self")
    temp = ["" for i in self.df[input_col] if len(str(i)) == 0]
    print("after self")
    lendf = self.df[input_col].shape[0]
    print("2 self")
    if((len(temp)/lendf) < lendf*perc/100):
        conditions = conditions.replace("df","self.df")
        print(conditions)
        print(type(conditions))
        eval(conditions)
This method is called internally: the JSON below is parsed, the function name is read from it, and the matching method is called with the given kwargs.
{
    "sheet_minimal_percentage": {
        "kwargs": {
            "input_col": "date of birth",
            "perc": 5,
            "conditions": "df[input_col].map(lambda x: f'09/09/{np.array([pd.to_datetime(i).year for i in df[input_col][~df[input_col].isna()]]).sum()// (df[input_col].count())}' if (str(x) == 'nan' or str(x) == '') else pd.to_datetime(x))"
        }
    }
}
I got an error saying that self is not defined inside the class method.
The print statements "after self" and "2 self" ran without any error; the error appears only when execution reaches eval. Could you help me resolve it? Here is the output:
STEP minimal_percentage
name self
after self
2 self
self.df[input_col].map(lambda x: f'09/09/{np.array([pd.to_datetime(i).year for i in self.df[input_col][~self.df[input_col].isna()]]).sum()// (self.df[input_col].count())}' if (str(x) == 'nan' or str(x) == '') else pd.to_datetime(x))
<class 'str'>
Exception name 'self' is not defined occured in function <class 'method'>
When I load the DataFrame from Excel directly and evaluate the same condition outside the class, it executes correctly:
import pandas as pd
ddf= pd.read_excel("C:/Users/ShreeHarsha/Documents/test2.xlsx")
t = "ddf[input_col].map(lambda x: f'09/09/{np.array([pd.to_datetime(i).year for i in ddf[input_col][~ddf[input_col].isna()]]).sum()// (ddf[input_col].count())}' if (str(x) == 'nan' or str(x) == '') else pd.to_datetime(x))"
input_col = "DATE OF BIRTH"
import numpy as np
eval(t)
This gives the expected output, where the empty value is replaced by 09/09/<average year of the column>:
0 1983-06-11 00:00:00
1 1986-02-23 00:00:00
2 1998-03-25 00:00:00
3 1977-04-23 00:00:00
4 1981-04-28 00:00:00
5 1953-04-02 00:00:00
6 1992-01-23 00:00:00
7 1974-10-03 00:00:00
8 1961-06-30 00:00:00
9 1973-07-06 00:00:00
10 1977-10-25 00:00:00
11 2001-02-26 00:00:00
12 1997-01-17 00:00:00
13 1982-08-31 00:00:00
14 1973-11-13 00:00:00
15 09/09/1980
Name: DATE OF BIRTH, dtype: object
PS: In the original project, all column headers are mapped to lowercase, so DATE OF BIRTH becomes date of birth.
This is another function, where eval executes successfully:
def apply_multiple_conditions(
    self, conditions, values, default, output_col_name, **kwargs
):
    # print(conditions.replace("column", "self.df"), values)
    self.df[output_col_name] = ""
    conditions = eval(conditions.replace("column", "self.df"))
    default = default.replace("column", "self.df")
    values = eval(values)
    self.df[output_col_name] = np.select(conditions, values, eval(default))
    # print(self.df[output_col_name])
    return self.df
and its JSON:
{
    "sheet_apply_multiple_conditions": {
        "kwargs": {
            "conditions": "[(column['home zip'].str.len() != 0) & (column['home zip'].str.len() > 30)]",
            "values": "['Home zip code is greater than 5 digits. Please shorten to 5 zip codes.']",
            "default": "str()",
            "output_col_name": "homezip_len_validation"
        }
    }
}
and the output of that function, where the code executes successfully:
STEP apply_multiple_conditions
After transformations
Empty DataFrame
Columns: [relationship, first name, last name, date of birth, age, gender (m or f), home zip, work zip, dental coverage status (ee, es, ec1, ec2, ef, we, ne), vision coverage status (ee, es, ec1, ec2, ef, we, ne), salary, salary mode, job title plumber, job title, occupation specialty, retiree (y or n), smoker (y or n), date of hire, std enrollment status, ltd enrollment status, basic life volume, supp life ee volume, supp life sp volume, supp life ch volume, vol life ee volume, vol life sp volume, vol life ch volume, basic ad&d volume, supp ad&d ee volume, supp ad&d sp volume, supp ad&d ch volume, vol ad&d tier, vol ad&d sp volume, vol ad&d ch volume, ltd voluntary volume, std voluntary volume, critical illness (ee, es, ec1, ec2, ef, we, ne), hospital plus insurance (ee, es, ec1, ec2, ef, we, ne), state, dental class number, vision class number, basic life class number, supp life class number, vol life class number, std class number, ltd class number, class number all products, occupation, vol ad&d ee volume, deductible insurance (ee, es, ec1, ec2, ef, we, ne), dental_coverage_present, dental_coverage_validation, vision_coverage_present, vision_coverage_validation, retiree_validation, smoker_validation, smoker_ci_validation, salarymode_validation, basiclifevol__neg_validation, basiclifevol__count_validation, supplifeeevol__neg_validation, supplifeeevol__count_validation, supplifespvol__neg_validation, supplifespvol__count_validation, supplifechvol__neg_validation, supplifechvol__count_validation, vollifeeevol__neg_validation, vollifeeevol__count_validation, vollifespvol__neg_validation, vollifespvol__count_validation, vollifechvol__neg_validation, vollifechvol__count_validation, basicad&dvol__neg_validation, basicad&dvol__count_validation, suppad&deevol__neg_validation, suppad&deevol__count_validation, suppad&dspvol__neg_validation, suppad&dspvol__count_validation, suppad&dchvol__neg_validation, suppad&dchvol__count_validation, volad&deevol__neg_validation, 
volad&deevol__count_validation, volad&dspvol__neg_validation, volad&dspvol__count_validation, volad&dchvol__neg_validation, volad&dchvol__count_validation, volltdvol__neg_validation, volltdvol__count_validation, volstdvol__neg_validation, volstdvol__count_validation, age_validation, gender_validation, homezip_int_validation, homezip_len_validation, workzip_int_validation, workzip_len_validation, occupation_character_validation, occupation_length_validation, job_title_character_validation, job_title_length_validation, ...]
Index: []
[0 rows x 119 columns]
CodePudding user response:
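Yes, this is because of the way eval works. Since you call it without explicit globals and locals, it uses globals() and locals() from the call site, which are two different objects inside a method. The expression is then evaluated as if it were in a class definition, which does not create an enclosing scope: the lambda in your string is a nested scope, so the names it uses (self, input_col) are looked up in the globals, not in the method's locals, and self is not there. Your other function works because its expression only uses self.df at the top level, where the locals are visible. This is warned about explicitly in the docs: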
If the locals dictionary is omitted it defaults to the globals dictionary. If both dictionaries are omitted, the expression is executed with the globals and locals in the environment where eval() is called. Note, eval() does not have access to the nested scopes (non-locals) in the enclosing environment.
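A minimal sketch of the situation in the question, using a hypothetical Sheet class with a plain list in place of the DataFrame (both are assumptions for illustration). The expression without a lambda finds self in the locals that eval picks up by default, while the one with a lambda fails because the lambda body looks self up as a global:

```python
class Sheet:
    """Hypothetical stand-in for the class in the question."""

    def __init__(self, values):
        self.values = values

    def direct(self):
        # `self` is used at the top level of the expression, so it is
        # found in the method's locals, which eval() uses by default.
        return eval("len(self.values)")

    def nested(self):
        # The lambda body is a nested scope. eval() cannot give it access
        # to the method's locals, so `self` is looked up as a global.
        try:
            return eval("(lambda: len(self.values))()")
        except NameError as exc:
            return f"NameError: {exc}"


sheet = Sheet([1, 2, 3])
direct_result = sheet.direct()
nested_result = sheet.nested()
print(direct_result)
print(nested_result)
```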
Here is another way to reproduce:
def foo():
    data = 42
    source = "def bar(): return data"
    glbs = globals()
    lcls = locals()
    exec(source, glbs, lcls)
    lcls['bar']()
And running in a REPL:
In [2]: foo()
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
<ipython-input-2-c19b6d9633cf> in <module>
----> 1 foo()
<ipython-input-1-63703f634871> in foo()
5 lcls = locals()
6 exec(source, glbs, lcls)
----> 7 lcls['bar']()
8
9
<string> in bar()
You could pass the same object to both parameters, and then it will act as if it were executed at module level, where the function's free names are resolved through that single namespace:
In [8]: def foo():
   ...:     data = 42
   ...:     source = "def bar(): return data"
   ...:     glbs = locals()
   ...:     lcls = glbs
   ...:     exec(source, glbs, lcls)
   ...:     return lcls['bar']()
   ...:
   ...:
In [9]: foo()
Out[9]: 42
But this is fundamentally a bad approach. You should almost certainly just not use eval here. eval is almost never the right approach.
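One way to avoid eval, sketched under assumptions: have the JSON name a rule, and look that rule up in a dictionary of ordinary functions. The RULES registry, the fill_missing_year function, and the "rule" key are hypothetical names, and a list of year integers stands in for the date column:

```python
import json


def fill_missing_year(values):
    """Replace None entries with the floor of the mean of the present years."""
    present = [v for v in values if v is not None]
    mean_year = sum(present) // len(present)
    return [mean_year if v is None else v for v in values]


# Hypothetical registry: the JSON can only select rules listed here.
RULES = {"fill_missing_year": fill_missing_year}

config = json.loads('{"rule": "fill_missing_year"}')
years = [1983, 1986, None, 1977]
filled = RULES[config["rule"]](years)
print(filled)
```

With this layout the JSON only selects behaviour that already exists in the code, so no names have to be resolved from a string at run time.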
CodePudding user response:
As @juanpa.arrivillaga said, those were free variables, so I passed an explicit dictionary as the globals, which eval then also uses as the locals:
symbols = {"self": self,"input_col":input_col,"pd":pd,"np":np}
eval(conditions,symbols)
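or you can merge the module globals and the method's locals into a single dictionary and pass that (passing them as two separate arguments does not help, because the lambda only sees the globals):
symbols = {**globals(), **locals()}
eval(conditions, symbols)
This should work.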
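A minimal sketch of the symbols approach, with a hypothetical Sheet class and a plain list standing in for the real class and DataFrame. Because the dictionary is passed as the globals, self is also visible inside the lambda:

```python
class Sheet:
    """Hypothetical stand-in for the class in the question."""

    def __init__(self, values):
        self.values = values

    def run(self, expression):
        # Everything the expression may reference goes into one dict,
        # passed as globals so nested scopes (lambdas) can see it too.
        symbols = {"self": self}
        return eval(expression, symbols)


sheet = Sheet([1, None, 3])
result = sheet.run(
    "list(map(lambda x: len(self.values) if x is None else x, self.values))"
)
print(result)
```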