I am having trouble applying filters with pandas. Each column name in filter_names should be paired with the value at the same position in filter_values: the first column should equal the first value, and the second column should be smaller than the second value. For example, with input like this:
import numpy as np
import pandas as pd

df = pd.DataFrame({'animal': ['cat', 'cat', 'snake', 'dog', 'dog', 'cat', 'snake', 'cat', 'dog', 'dog'],
                   'age': [2.5, 3, 0.5, np.nan, 5, 2, 4.5, np.nan, 7, 3],
                   'name': ['Murzik', 'Pushok', 'Kaa', 'Bobik', 'Strelka', 'Vaska', 'Kaa2', 'Murka', 'Graf', 'Muhtar'],
                   'visits': [1, 3, 2, 3, 2, 3, 1, 1, 2, 1],
                   'priority': ['yes', 'yes', 'no', 'yes', 'no', 'no', 'no', 'yes', 'no', 'no']},
                  index=['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j'])
filter_names = ["animal", "age"]
filter_values = ["cat", 3]
The condition I want to apply is: animal == "cat" and age < 3.
It should produce the DataFrame below:
  animal  age    name  visits priority
a    cat  2.5  Murzik       1      yes
f    cat  2.0   Vaska       3       no
I wrote the following code to achieve this effect:
df_filtered = df[(filter_names[0]==filter_values[0])&(df[filter_names[1]]>=filter_values[1])]
to no avail. What am I missing?
CodePudding user response:
I think you left out df[...] in the first condition and used the wrong comparison operator in the second one:
df[(df[filter_names[0]] == filter_values[0]) & (df[filter_names[1]] < filter_values[1])]
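To see why the original expression selects nothing, it may help to look at each piece on its own. The sketch below uses a shortened four-row version of the frame (an assumption made to keep the output readable) and compares the bare string comparison with the column comparison.

```python
import numpy as np
import pandas as pd

# Shortened version of the frame from the question (first four rows only).
df = pd.DataFrame(
    {"animal": ["cat", "cat", "snake", "dog"],
     "age": [2.5, 3, 0.5, np.nan]},
    index=["a", "b", "c", "d"],
)
filter_names = ["animal", "age"]
filter_values = ["cat", 3]

# This compares the strings "animal" and "cat", so the result is a single
# Python bool rather than a per-row mask.
first = filter_names[0] == filter_values[0]
print(first)  # False

# Indexing the frame by the column name gives an element-wise comparison.
mask = df[filter_names[0]] == filter_values[0]
print(mask.tolist())  # [True, True, False, False]

# A scalar False combined with a boolean Series is False for every row,
# which is why the original expression returns an empty frame.
broken = first & (df[filter_names[1]] >= filter_values[1])
print(broken.tolist())  # [False, False, False, False]
```

Note also that comparisons against NaN evaluate to False, so rows with a missing age are dropped by either operator.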
The corrected expression works like this:
In [2]: df[(df[filter_names[0]] == filter_values[0]) & (df[filter_names[1]] < filter_values[1])]
Out[2]:
  animal  age    name  visits priority
a    cat  2.5  Murzik       1      yes
f    cat  2.0   Vaska       3       no
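If the lists may grow beyond two entries, one possible way to generalise is to keep the comparison operators in a third list and combine the resulting masks. This is only a sketch: the helper apply_filters and the list filter_ops are hypothetical names that are not part of the question, and the operators come from the standard-library operator module.

```python
import operator
from functools import reduce

import numpy as np
import pandas as pd


def apply_filters(frame, names, ops, values):
    """Keep rows where every (column, operator, value) triple holds.

    Hypothetical helper; assumes the three lists are non-empty and of
    equal length.
    """
    masks = [op(frame[name], value) for name, op, value in zip(names, ops, values)]
    # Combine all boolean masks with element-wise AND.
    return frame[reduce(operator.and_, masks)]


df = pd.DataFrame({'animal': ['cat', 'cat', 'snake', 'dog', 'dog', 'cat', 'snake', 'cat', 'dog', 'dog'],
                   'age': [2.5, 3, 0.5, np.nan, 5, 2, 4.5, np.nan, 7, 3],
                   'name': ['Murzik', 'Pushok', 'Kaa', 'Bobik', 'Strelka', 'Vaska', 'Kaa2', 'Murka', 'Graf', 'Muhtar'],
                   'visits': [1, 3, 2, 3, 2, 3, 1, 1, 2, 1],
                   'priority': ['yes', 'yes', 'no', 'yes', 'no', 'no', 'no', 'yes', 'no', 'no']},
                  index=['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j'])

filter_names = ["animal", "age"]
filter_ops = [operator.eq, operator.lt]  # assumed: one operator per column
filter_values = ["cat", 3]

result = apply_filters(df, filter_names, filter_ops, filter_values)
print(result.index.tolist())  # ['a', 'f']
```

Keeping the operators as data makes the mismatch between ">=" and "<" visible in one place instead of buried in the indexing expression.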