Home > Enterprise >  Filter row based data on multiple criterias
Filter row based data on multiple criterias

Time:02-18

I have the following row based dataset in pandas and want to filter modules that fulfill multiple critera, e.g. all modules with (setting='Foo' and value=2) and (setting='bar' and value=1)

Currently i am getting seperatly the modules for each criteria and then building the intersection of all single results. Is there a more efficient way, e.g. to transfer the dataset into a column based dataset and then filtering with one query (setting_foo=2 and settting_bar=3)

module    setting   value
-----------------------------------------
A         Foo       2
A         bar       1
B         Foo       3
B         bar       1
...

Desired output for this example would be module A because module B fulfills bar=1 but not foo=2

CodePudding user response:

You can use .pivot to make columns for Foo and Bar for each module, as below:

df_pivot = df.pivot(index='module', columns='setting', values='value').reset_index()

Gives:

setting module  Foo  bar
0            A    2    1
1            B    3    1

Then you can use .loc to filter where Foo=2 and bar=1, and filter on the module column, as below:

df_pivot.loc[(df_pivot['Foo'] == 2) & (df_pivot['bar'] == 1), 'module']

Which outputs:

0    A
Name: module, dtype: object
  • Related