Exception "name 'self' is not defined" occurred in function <class 'method'>

I have a function/method written inside a class:

def minimal_percentage(self, input_col=None, perc=5, conditions=None, **kwargs):
        print("name self")
        temp = ["" for i in self.df[input_col] if len(str(i)) == 0]
        print("after self")
        lendf = self.df[input_col].shape[0]
        print("2 self")
        if((len(temp)/lendf) < lendf*perc/100):
            conditions = conditions.replace("df","self.df")
            print(conditions)
            print(type(conditions))
            eval(conditions)

This method is called internally: the code parses the JSON below, reads the function name from it, and calls that function with the given kwargs.

 {
        "sheet_minimal_percentage":{
            "kwargs":{
                "input_col": "date of birth",
                "perc":5,
                "conditions": "df[input_col].map(lambda x: f'09/09/{np.array([pd.to_datetime(i).year for i in df[input_col][~df[input_col].isna()]]).sum()// (df[input_col].count())}' if (str(x) == 'nan' or str(x) == '') else pd.to_datetime(x))"
            }
        }
    }

I got the following:

The output says that name 'self' is not defined, occurring in function <class 'method'>.

However, the print statements "after self" and "2 self" executed without any error; the error appears only when execution reaches eval. Could you help me resolve it?

STEP minimal_percentage

name self
after self
2 self
self.df[input_col].map(lambda x: f'09/09/{np.array([pd.to_datetime(i).year for i in self.df[input_col][~self.df[input_col].isna()]]).sum()// (self.df[input_col].count())}' if (str(x) == 'nan' or str(x) == '') else pd.to_datetime(x))
<class 'str'>
Exception name 'self' is not defined occured in function <class 'method'>

When I loaded the DataFrame from Excel and evaluated the same condition directly, it executed correctly.

import pandas as pd
ddf= pd.read_excel("C:/Users/ShreeHarsha/Documents/test2.xlsx")

t = "ddf[input_col].map(lambda x: f'09/09/{np.array([pd.to_datetime(i).year for i in ddf[input_col][~ddf[input_col].isna()]]).sum()// (ddf[input_col].count())}' if (str(x) == 'nan' or str(x) == '') else pd.to_datetime(x))"

input_col = "DATE OF BIRTH"

import numpy as np

eval(t)

I got the expected output, in which the empty string is replaced by 09/09/<average year of the column>:

0     1983-06-11 00:00:00
1     1986-02-23 00:00:00
2     1998-03-25 00:00:00
3     1977-04-23 00:00:00
4     1981-04-28 00:00:00
5     1953-04-02 00:00:00
6     1992-01-23 00:00:00
7     1974-10-03 00:00:00
8     1961-06-30 00:00:00
9     1973-07-06 00:00:00
10    1977-10-25 00:00:00
11    2001-02-26 00:00:00
12    1997-01-17 00:00:00
13    1982-08-31 00:00:00
14    1973-11-13 00:00:00
15             09/09/1980
Name: DATE OF BIRTH, dtype: object

PS: In the original project, all column headers are converted from capitals to lowercase, so DATE OF BIRTH is equivalent to date of birth.

Here is another function, in which eval executes successfully:

 def apply_multiple_conditions(
        self, conditions, values, default, output_col_name, **kwargs
    ):

        # print(conditions.replace("column", "self.df"), values)
        self.df[output_col_name] = ""
        conditions = eval(conditions.replace("column", "self.df"))
        default = default.replace("column", "self.df")
        values = eval(values)
        self.df[output_col_name] = np.select(conditions, values, eval(default))
        # print(self.df[output_col_name])
        return self.df

and its JSON:

 {
        "sheet_apply_multiple_conditions": {
            "kwargs": {
                "conditions": "[(column['home zip'].str.len() != 0) & (column['home zip'].str.len() > 30)]",
                "values": "['Home zip code is greater than 5 digits.  Please shorten to 5 zip codes.']",
                "default": "str()",
                "output_col_name": "homezip_len_validation"
            }
        }
    }

and the output of that function, where the code executes successfully:

STEP apply_multiple_conditions


After transformations
Empty DataFrame
Columns: [relationship, first name, last name, date of birth, age, gender (m or f), home zip, work zip, dental coverage status (ee, es, ec1, ec2, ef, we, ne), vision coverage status (ee, es, ec1, ec2, ef, we, ne), salary, salary mode, job title plumber, job title, occupation specialty, retiree (y or n), smoker (y or n), date of hire, std enrollment status, ltd enrollment status, basic life volume, supp life ee volume, supp life sp volume, supp life ch volume, vol life ee volume, vol life sp volume, vol life ch volume, basic ad&d volume, supp ad&d ee volume, supp ad&d sp volume, supp ad&d ch volume, vol ad&d tier, vol ad&d sp volume, vol ad&d ch volume, ltd voluntary volume, std voluntary volume, critical illness (ee, es, ec1, ec2, ef, we, ne), hospital plus insurance (ee, es, ec1, ec2, ef, we, ne), state, dental class number, vision class number, basic life class number, supp life class number, vol life class number, std class number, ltd class number, class number all products, occupation, vol ad&d ee volume, deductible insurance (ee, es, ec1, ec2, ef, we, ne), dental_coverage_present, dental_coverage_validation, vision_coverage_present, vision_coverage_validation, retiree_validation, smoker_validation, smoker_ci_validation, salarymode_validation, basiclifevol__neg_validation, basiclifevol__count_validation, supplifeeevol__neg_validation, supplifeeevol__count_validation, supplifespvol__neg_validation, supplifespvol__count_validation, supplifechvol__neg_validation, supplifechvol__count_validation, vollifeeevol__neg_validation, vollifeeevol__count_validation, vollifespvol__neg_validation, vollifespvol__count_validation, vollifechvol__neg_validation, vollifechvol__count_validation, basicad&dvol__neg_validation, basicad&dvol__count_validation, suppad&deevol__neg_validation, suppad&deevol__count_validation, suppad&dspvol__neg_validation, suppad&dspvol__count_validation, suppad&dchvol__neg_validation, suppad&dchvol__count_validation, volad&deevol__neg_validation, 
volad&deevol__count_validation, volad&dspvol__neg_validation, volad&dspvol__count_validation, volad&dchvol__neg_validation, volad&dchvol__count_validation, volltdvol__neg_validation, volltdvol__count_validation, volstdvol__neg_validation, volstdvol__count_validation, age_validation, gender_validation, homezip_int_validation, homezip_len_validation, workzip_int_validation, workzip_len_validation, occupation_character_validation, occupation_length_validation, job_title_character_validation, job_title_length_validation, ...]
Index: []

[0 rows x 119 columns]

CodePudding user response：

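Yes, this is because of the way eval works. When called with no extra arguments, eval uses globals() and locals() from the call site, and inside a method those are two different objects. The expression is then evaluated as if it were in a class definition, which does not create an enclosing scope: names at the top level of the expression are found in locals, but names used inside a nested function such as your lambda are looked up only in globals, and self is not there. That is also why apply_multiple_conditions works (its expressions contain no lambda) and why your standalone test works (ddf and input_col are globals there). This is warned about explicitly in the docs: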

If the locals dictionary is omitted it defaults to the globals dictionary. If both dictionaries are omitted, the expression is executed with the globals and locals in the environment where eval() is called. Note, eval() does not have access to the nested scopes (non-locals) in the enclosing environment.
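To make the quoted note concrete, here is a minimal sketch of the same failure inside a method. The Demo class and its method names are hypothetical and only stand in for the class in the question; the point is that the same name resolves at the top level of the expression but not inside a lambda.

```python
class Demo:
    def __init__(self):
        self.value = 42

    def top_level(self):
        # 'self' appears at the top level of the expression,
        # so it is found in the method's locals.
        return eval("self.value")

    def inside_lambda(self):
        # 'self' appears inside a lambda: the lambda body only sees the
        # globals given to eval (plus builtins), so the lookup fails.
        try:
            return eval("(lambda: self.value)()")
        except NameError as exc:
            return f"NameError: {exc}"


demo = Demo()
print(demo.top_level())      # 42
print(demo.inside_lambda())  # a NameError message mentioning 'self'
```

This mirrors the question: the prints before eval and the outer self.df[input_col] lookup succeed, and the failure only shows up once map() actually calls the lambda.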

Here is another way to reproduce:

def foo():
    data = 42
    source  = "def bar(): return data"
    glbs = globals()
    lcls = locals()
    exec(source, glbs, lcls)
    lcls['bar']()

And running in a REPL:

In [2]: foo()
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-2-c19b6d9633cf> in <module>
----> 1 foo()

<ipython-input-1-63703f634871> in foo()
      5     lcls = locals()
      6     exec(source, glbs, lcls)
----> 7     lcls['bar']()
      8
      9

<string> in bar()

You could pass the same object to both parameters, and then it will act as if it were executed in the global context, where every name bound in that single namespace is also visible to nested functions:

In [8]: def foo():
   ...:     data = 42
   ...:     source  = "def bar(): return data"
   ...:     glbs = locals()
   ...:     lcls = glbs
   ...:     exec(source, glbs, lcls)
   ...:     return lcls['bar']()
   ...:
   ...:

In [9]: foo()
Out[9]: 42

But this is fundamentally a bad approach.

You should almost certainly just not use eval here. eval is almost never the right approach.
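One eval-free alternative, sketched under assumptions: keep the transformation logic in ordinary Python functions and let the JSON only name which one to run. The TRANSFORMS registry, the register decorator, the function name fill_missing_with_mean_year, and the plain list used as a column are all hypothetical; they stand in for the pandas code in the question rather than reproduce it.

```python
import json
from datetime import date

# Hypothetical registry: the JSON refers to a transformation by name
# instead of carrying Python source code to be eval'd.
TRANSFORMS = {}


def register(func):
    """Store the function in the registry under its own name."""
    TRANSFORMS[func.__name__] = func
    return func


@register
def fill_missing_with_mean_year(values):
    """Replace every non-date entry (e.g. None or '') with '09/09/<mean year>'."""
    years = [v.year for v in values if isinstance(v, date)]
    if not years:
        return list(values)  # nothing to average, leave the data as it is
    filler = f"09/09/{sum(years) // len(years)}"
    return [v if isinstance(v, date) else filler for v in values]


config = json.loads('{"transform": "fill_missing_with_mean_year"}')
column = [date(1983, 6, 11), None, date(1977, 4, 23), ""]
result = TRANSFORMS[config["transform"]](column)
print(result)
```

Because the function body is normal code, self, input_col, and any imported modules resolve through ordinary scoping, and an unknown name in the JSON fails with a plain KeyError instead of running arbitrary code.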

CodePudding user response：

As @juanpa.arrivillaga said, those were free variables. That is why I passed an explicit dictionary to serve as the globals; since no separate locals argument is given, the same dictionary is used for locals too:

symbols = {"self": self, "input_col": input_col, "pd": pd, "np": np}
eval(conditions, symbols)

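or, to expose every name visible at the call site, merge both namespaces into one dictionary:

symbols = {**globals(), **locals()}
eval(conditions, symbols)

Passing globals() and locals() as two separate arguments does not help here, in either order: names used inside the lambda are looked up only in the first (globals) dictionary, which would then be missing either self and input_col or pd and np.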
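A self-contained check of the single-dictionary form, using only the standard library. The Holder class, its run method, and the expression are hypothetical stand-ins for the class and the conditions string in the question; math plays the role of pd and np.

```python
import math


class Holder:
    def __init__(self, values):
        self.values = values

    def run(self, expression):
        # One namespace serves as both globals and locals, so names used
        # inside a lambda in the expression resolve as well.
        symbols = {"self": self, "math": math}
        return eval(expression, symbols)


holder = Holder([1.5, 2.5, 4.0])
expr = "list(map(lambda x: math.floor(x) + len(self.values), self.values))"
print(holder.run(expr))  # [4, 5, 7]
```

Builtins such as list, map, and len remain available because eval adds __builtins__ to the globals dictionary when it is missing.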