I have a dataframe with multiple columns. I want to filter the dataframe based on two columns. For one column there is one condition and for the other column there are 3 conditions.
This code for the one condition in each column works fine:
filtered_df = df[(df['col1'] == 'cond1') & (df['col2'] == 'cond2)]
When I expand this to 3 conditions for the second column I get an empty dataframe as a result.
filtered_df = df[(df['col1'] == 'cond1') & (df['col2'] == 'cond2) & (df['col2'] == 'cond3) & (df['col2'] == 'cond4)]
How can I filter with more than 2 conditions?
CodePudding user response:
(df['col2'] == 'cond2') & (df['col2'] == 'cond3')
No value can be equal to both cond2 and cond3 at the same time, so this condition is False for every row. You probably want | (or) instead of &.
It is better to use isin with a list of values:
df['col2'].isin(['cond2', 'cond3'])
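To make the difference concrete, here is a minimal sketch on a made-up dataframe. The sample rows, the extra `value` column and the `wanted` list are assumptions for illustration; only `col1`, `col2` and the `cond1` to `cond4` values come from the question.

```python
import pandas as pd

# Hypothetical sample data, not from the question
df = pd.DataFrame({
    "col1": ["cond1", "cond1", "cond1", "other", "cond1"],
    "col2": ["cond2", "cond3", "cond4", "cond2", "zzz"],
    "value": [1, 2, 3, 4, 5],
})

# Chaining == with & on the same column can never match a row
empty = df[(df["col2"] == "cond2") & (df["col2"] == "cond3")]

# isin keeps rows whose col2 is any one of the listed values
wanted = ["cond2", "cond3", "cond4"]
filtered_df = df[(df["col1"] == "cond1") & df["col2"].isin(wanted)]

print(len(empty))
print(filtered_df)
```

With isin, adding a fourth or fifth allowed value only means extending the list, not the boolean expression.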
CodePudding user response:
filtered_df = df[(df['col1'] == 'cond1') & (df['col2'] == 'cond2) & (df['col2'] == 'cond3) & (df['col2'] == 'cond4)]
Be careful: your code contains three conditions on col2 that contradict each other. col2 cannot be cond2 and cond3 and cond4 at the same time, so you must combine them with or (|):
filtered_df = df[(df['col1'] == 'cond1') & ((df['col2'] == 'cond2') | (df['col2'] == 'cond3') | (df['col2'] == 'cond4'))]
CodePudding user response:
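Your code is missing the closing ' after cond2, cond3 and cond4. With the quotes fixed it parses, although the chained & conditions on col2 still cannot all be true for the same row: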
filtered_df = df.loc[(df['col1'] == 'cond1') & (df['col2'] == 'cond2') & (df['col2'] == 'cond3') & (df['col2'] == 'cond4')]
CodePudding user response:
You can use DataFrame.query:
filtered_df = df.query('(col2=="cond2" or col2=="cond3" or col2=="cond4") and (col1 == "cond1")')
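The query string can probably be shortened with the `in` operator, which query expressions support, and `@` to reference a local Python variable. This is a sketch on made-up data: the sample rows, the `value` column and the `allowed` list are assumptions for illustration.

```python
import pandas as pd

# Hypothetical sample data, not from the question
df = pd.DataFrame({
    "col1": ["cond1", "cond1", "cond1", "other", "cond1"],
    "col2": ["cond2", "cond3", "cond4", "cond2", "zzz"],
    "value": [1, 2, 3, 4, 5],
})

allowed = ["cond2", "cond3", "cond4"]

# @allowed refers to the local list defined above
filtered_df = df.query('col1 == "cond1" and col2 in @allowed')

# Equivalent boolean-mask form, shown for comparison
same_result = df[(df["col1"] == "cond1") & df["col2"].isin(allowed)]

print(filtered_df)
```

Both forms should select the same rows; which one reads better is mostly a matter of taste.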