Lazy evaluate Pandas dataframe filters

I'm seeing behavior that looks odd to me. How can I define a filter once and reuse it throughout my code?

>>> df = pd.DataFrame([1,2,3], columns=['A'])
>>> my_filter = df.A == 2
>>> df.loc[1] = 5
>>> df[my_filter]
   A
1  5

I expected df[my_filter] to return an empty DataFrame, since no value in column A is equal to 2 anymore.

I'm thinking about writing a function that returns the filter and reusing that, but is there a more Pythonic (and more pandas-idiomatic) way of doing this?

def get_my_filter(df):
    return df.A == 2

df[get_my_filter(df)]
# ... change df ...
df[get_my_filter(df)]

CodePudding user response：

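Masks are not dynamic: a mask is a boolean Series computed once, when the comparison runs, and it does not update when the DataFrame changes later. So if you still need to change the DataFrame value, swap lines 2 and 3 of your snippet, so that df.loc[1] = 5 runs before my_filter = df.A == 2. That would work.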
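
A minimal sketch of that reordering, assuming the same df as in the question. The assignment runs first, so the mask is built from the updated values:

```python
import pandas as pd

df = pd.DataFrame([1, 2, 3], columns=['A'])
df.loc[1] = 5            # modify the data first
my_filter = df.A == 2    # the mask is computed from the current values
result = df[my_filter]   # no row has A == 2 anymore
print(result.empty)      # prints True
```

This only helps when every change to df happens before the mask is built. Any later change would again require recomputing the mask.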

CodePudding user response：

You computed the filter before changing the data, so changing a value in the row afterward has no effect on it.

import pandas as pd

df = pd.DataFrame([1,2,3], columns=['A'])
my_filter = df.A == 2
print(my_filter)
'''
0    False
1     True
2    False
Name: A, dtype: bool
'''

As you can see, the comparison returns a boolean Series. If you change the data afterward, that Series does not change, because it reflects the df as it was when the comparison ran. But you can define the filter as a string instead. If you pass that string to eval() each time you use it, the comparison is re-evaluated against the current data, which gives you what you want.

df = pd.DataFrame([1,2,3], columns=['A'])
my_filter = 'df.A == 2'  # a string, not evaluated yet
df.loc[1] = 5
df[eval(my_filter)]  # the comparison runs here, on the current df

'''
Empty DataFrame
Columns: [A]
Index: []
'''
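
If running eval() on a string of arbitrary code is a concern, pandas has two built-in mechanisms that also defer the comparison until the filter is applied: a callable passed to df.loc, and a string expression passed to DataFrame.query. The sketch below assumes the same df as above; the function name is_two is hypothetical and used only for illustration.

```python
import pandas as pd

df = pd.DataFrame([1, 2, 3], columns=['A'])

# Hypothetical helper: builds the mask from whatever DataFrame it is given,
# so the comparison is recomputed every time the filter is applied.
def is_two(frame):
    return frame['A'] == 2

before = df.loc[is_two]           # one row: index 1, where A == 2
df.loc[1] = 5
after = df.loc[is_two]            # empty: the mask is rebuilt from current data

# DataFrame.query evaluates the string against the current column values.
after_query = df.query('A == 2')  # also empty
```

The callable form is close to the get_my_filter idea from the question, except that df.loc calls the function for you with the DataFrame being indexed.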