Short way to filter a Pandas dataframe by a fixed condition using a list of keys

Time:03-30

Let's say I have the below dataframe:

    Column1 Column2 Column3 Column4
1.  1       0       1       0
2.  0       1       1       0
3.  1       0       0       0
4.  0       1       0       1
5.  1       1       0       1

I want to keep only the rows in which Column1 is 1 and every other column is 0.

I know that I can do something like this:

my_df = df[(df['Column1'] == 1) & (df['Column2'] == 0) & (df['Column3'] == 0) & (df['Column4'] == 0)]

I wonder if there is a way to shorten this so that I can make a function for this filtering logic? Something like:

def my_filter(df, column):
    other_columns = [item for item in df.columns if item != column]
    return df[df[column] == 1 & df[other_column] == 0 for other_column in other_columns]

I've tried the above but this syntax doesn't work. Any pointers would be appreciated.

CodePudding user response:

You can compare every column except Column1 with 0 and use DataFrame.all to test whether all values in each row are True:

my_df = df[(df['Column1'] == 1) & (df.drop(columns='Column1') == 0).all(axis=1)]
print(my_df)
     Column1  Column2  Column3  Column4
3.0        1        0        0        0
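
The same idea can be wrapped into the kind of function the question asks for. This is a minimal sketch, assuming every column other than the chosen one must equal 0. The parameters `on_value` and `off_value` are hypothetical names added for illustration, and the sample frame is rebuilt from the table in the question with a plain integer index.

```python
import pandas as pd


def my_filter(df, column, on_value=1, off_value=0):
    """Keep rows where `column` equals on_value and all other columns equal off_value."""
    # All columns except the one being tested for on_value
    others = df.drop(columns=column)
    # Row-wise mask: chosen column matches AND every remaining column matches
    mask = df[column].eq(on_value) & others.eq(off_value).all(axis=1)
    return df[mask]


# Sample data taken from the question (index labels assumed to be 1..5)
df = pd.DataFrame(
    {
        "Column1": [1, 0, 1, 0, 1],
        "Column2": [0, 1, 0, 1, 1],
        "Column3": [1, 1, 0, 0, 0],
        "Column4": [0, 0, 0, 1, 1],
    },
    index=[1, 2, 3, 4, 5],
)

result = my_filter(df, "Column1")
print(result)
```

Calling `my_filter(df, "Column2")` on the same data returns an empty frame, since no row has Column2 equal to 1 with all other columns equal to 0.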

CodePudding user response:

Use:

cols_to_check = ['Column1', 'Column2', 'Column3', 'Column4']
vals_to_check = [1, 0, 0, 0]

out = df[df[cols_to_check].eq(vals_to_check).all(axis=1)]
print(out)

# Output
   Column1  Column2  Column3  Column4
3.       1        0        0        0
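
If keeping two parallel lists in sync is a concern, the expected values can instead be held in a Series keyed by column name, so the comparison aligns on labels rather than on position. This is a sketch under that assumption, not part of the answer above. The name `expected` is hypothetical and the sample frame is rebuilt from the question.

```python
import pandas as pd

# Sample data taken from the question (index labels assumed to be 1..5)
df = pd.DataFrame(
    {
        "Column1": [1, 0, 1, 0, 1],
        "Column2": [0, 1, 0, 1, 1],
        "Column3": [1, 1, 0, 0, 0],
        "Column4": [0, 0, 0, 1, 1],
    },
    index=[1, 2, 3, 4, 5],
)

# Expected value per column: 0 everywhere, then 1 for the column of interest
expected = pd.Series(0, index=df.columns)
expected["Column1"] = 1

# DataFrame.eq matches the Series index against the column labels by default
out = df[df.eq(expected).all(axis=1)]
print(out)
```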