Meta programming with dictionary comprehensions

Time:12-04

I have a bunch of functions I want to create with different parameters. One of the parameters, df, will be provided by the caller of these functions. I thought I had it figured out, but when I actually used it, every function created had the same parameters: the last combination in the dictionary comprehension's sequence. Weird.

from itertools import product

feature_functions = {
    **{f'{col}{i}': lambda x: createFeature(df=x, i=i, col=col, name=f'{col}{i}')
        for col, i in product(['New', 'Lost', 'Change'], list(range(1, 31)))},
}

Like I said, I thought this was pretty slick, but when I used it like so:

feature_functions['New1'](df)

I got this result, meaning it was using 'Change' and 30 for every lambda function:

# feature pd.Series:
0     NaN
      ...   
4593  1.002706
Name: Change30, Length: 4594, dtype: float64

I tried several things, but nothing changed. How am I using this dictionary comprehension wrong?

EDIT: By the way, one thing I did to verify the comprehension was to put the lambda x: ... in quotes so I could print everything out, and it looked right. So somehow the lambda is getting in the way? I did try wrapping it in parentheses, (lambda x: ...), but that changed nothing.

{'New1': "lambda x: createFeature(df=x, i=1, col=New, name='New1')",
 'New2': "lambda x: createFeature(df=x, i=2, col=New, name='New2')",
 'New3': "lambda x: createFeature(df=x, i=3, col=New, name='New3')",
 'New4': "lambda x: createFeature(df=x, i=4, col=New, name='New4')",
 ... 
}

CodePudding user response:

You are really close. Instead of a lambda, use partial from the functools module, which binds the argument values at the moment each function is created:

from functools import partial
from itertools import product

import pandas as pd

# dummy function
def createFeature(df, i, col, name):
    print(df)
    print(i, col, name)

feature_functions = {
    **{f'{col}{i}': partial(createFeature, i=i, col=col, name=f'{col}{i}')
        for col, i in product(['New', 'Lost', 'Change'], list(range(1, 31)))}}

Usage:

>>> feature_functions['New1'](pd.DataFrame())
Empty DataFrame
Columns: []
Index: []
1 New New1

>>> feature_functions['Lost23'](pd.DataFrame())
Empty DataFrame
Columns: []
Index: []
23 Lost Lost23

>>> feature_functions['Change12'](pd.DataFrame())
Empty DataFrame
Columns: []
Index: []
12 Change Change12
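
To see why partial behaves differently from the lambda, it can help to look at what each partial object stores. The following is only a minimal sketch: createFeature here is a hypothetical stand-in that returns its arguments instead of building a real feature, and a plain string is passed where the DataFrame would go.

```python
from functools import partial
from itertools import product


# Hypothetical stand-in for the real createFeature: it just echoes its arguments.
def createFeature(df, i, col, name):
    return (df, i, col, name)


feature_functions = {
    f'{col}{i}': partial(createFeature, i=i, col=col, name=f'{col}{i}')
    for col, i in product(['New', 'Lost', 'Change'], range(1, 31))
}

# Each partial object records the keyword arguments it was created with,
# so the values are fixed while the comprehension is still iterating.
new1 = feature_functions['New1']
print(new1.keywords)  # {'i': 1, 'col': 'New', 'name': 'New1'}

# The remaining positional argument (df) is supplied by the caller.
print(new1('some df'))  # ('some df', 1, 'New', 'New1')
```

The keywords attribute is part of the documented functools.partial interface, alongside func and args.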

CodePudding user response:

Ok, this is very interesting. If you create a function that returns your lambda, such as:

def createFeatureCreator(i, col):
   return lambda x: createFeature(df=x, i=i, col=col, name=f'{col}{i}')

and do your comprehension (I removed a lot of spurious things you had):

feature_functions = {f'{col}{i}': createFeatureCreator(i, col)
        for col, i in product(['New', 'Lost', 'Change'], range(1, 31))}

it works as you would expect.

The reason the lambda does not work directly in the comprehension is actually very interesting: a lambda captures variables, not their values. The dict comprehension is a single scope in which the variables i and col are rebound on each iteration of the loop. When the lambdas are created (and 90 different lambdas are indeed created, 3 columns times 30 values), they all close over that same scope, so when they are later executed, i and col have the last values they were given: 'Change' and 30. The f-string is not evaluated when the lambda is created either; it sits inside the lambda body and is evaluated only when you actually call the function, which is why name also appears to be "wrong". Calling createFeatureCreator(i, col) avoids the problem because each call creates a new scope with its own i and col.
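
The late-binding behaviour described here can be reproduced without pandas or createFeature. This is a small sketch with made-up names (late, bound) and a three-element range in place of the original product; it also shows a common alternative fix, binding the loop variable as a default argument, which is not what either answer above uses but relies on the same idea of fixing the value at definition time.

```python
# Late binding: each lambda looks up i when it is called, not when it is defined,
# so after the comprehension finishes every lambda sees the final value of i.
late = {f'f{i}': (lambda x: (x, i)) for i in range(1, 4)}
print(late['f1']('a'))  # ('a', 3)

# Default argument values are evaluated when the lambda is defined,
# so each lambda keeps the value i had on its own iteration.
bound = {f'f{i}': (lambda x, i=i: (x, i)) for i in range(1, 4)}
print(bound['f1']('a'))  # ('a', 1)
```

One caveat with the default-argument form: a caller can accidentally override i by passing a second argument, which is one reason a factory function or partial may be preferable.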
