Let's say I have the below dataframe:
    Column1  Column2  Column3  Column4
1.        1        0        1        0
2.        0        1        1        0
3.        1        0        0        0
4.        0        1        0        1
5.        1        1        0        1
I want to keep only the rows in which the value in Column1 is 1 and the values in all other columns are 0.
I know that I can do something like this:
my_df = df[(df['Column1'] == 1) & (df['Column2'] == 0) & (df['Column3'] == 0) & (df['Column4'] == 0)]
I wonder if there is a way to shorten this so that I can make a function for this filtering logic? Something like:
def my_filter(df, column):
    other_columns = [item for item in df.columns if item != column]
    return df[df[column] == 1 & df[other_column] == 0 for other_column in other_columns]
I've tried the above but this syntax doesn't work. Any pointers would be appreciated.
CodePudding user response:
You can compare all columns except Column1 with 0 and use DataFrame.all to test whether all values in each row are True:
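# drop(columns=...) and all(axis=1) use keyword arguments; the positional axis form was removed from drop in pandas 2.0
my_df = df[(df['Column1'] == 1) & (df.drop(columns='Column1') == 0).all(axis=1)]
print(my_df)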
    Column1  Column2  Column3  Column4
3.        1        0        0        0
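Since the question asks for a reusable function, the same idea could be wrapped up roughly like this. This is only a sketch: the sample frame is rebuilt from the table in the question, and the integer index 1 to 5 is an assumption about what the original index looked like.

```python
import pandas as pd


def my_filter(df, column):
    # True for rows where the chosen column equals 1
    target_is_one = df[column] == 1
    # True for rows where every remaining column equals 0
    others_are_zero = (df.drop(columns=column) == 0).all(axis=1)
    return df[target_is_one & others_are_zero]


# Sample data rebuilt from the question (the index labels are an assumption)
df = pd.DataFrame(
    {
        'Column1': [1, 0, 1, 0, 1],
        'Column2': [0, 1, 0, 1, 1],
        'Column3': [1, 1, 0, 0, 0],
        'Column4': [0, 0, 0, 1, 1],
    },
    index=[1, 2, 3, 4, 5],
)

result = my_filter(df, 'Column1')
print(result)
```

Each comparison is wrapped in its own expression before combining with &, because & binds more tightly than == in Python. That is also why the unparenthesized version in the question fails.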
CodePudding user response:
Use:
cols_to_check = ['Column1', 'Column2', 'Column3', 'Column4']
vals_to_check = [1, 0, 0, 0]
out = df[df[cols_to_check].eq(vals_to_check).all(axis=1)]
print(out)
# Output
    Column1  Column2  Column3  Column4
3.        1        0        0        0
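If you want this as a function of a single column name, one possible variation is to build the expected values automatically instead of listing them by hand. The name filter_one_hot and the sample frame below are made up for illustration. The sketch relies on DataFrame.eq aligning a Series with the column labels.

```python
import pandas as pd


def filter_one_hot(df, column):
    # Expected value per column: 0 everywhere, 1 for the chosen column
    expected = pd.Series(0, index=df.columns)
    expected[column] = 1
    # eq aligns `expected` with the column labels; all(axis=1) checks each row
    return df[df.eq(expected).all(axis=1)]


# Sample data rebuilt from the question (the index labels are an assumption)
df = pd.DataFrame(
    {
        'Column1': [1, 0, 1, 0, 1],
        'Column2': [0, 1, 0, 1, 1],
        'Column3': [1, 1, 0, 0, 0],
        'Column4': [0, 0, 0, 1, 1],
    },
    index=[1, 2, 3, 4, 5],
)

out = filter_one_hot(df, 'Column1')
print(out)
```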