I need to filter a pandas DataFrame using two boolean conditions, keeping the rows for which a condition is True.
DataFrame:
import numpy as np
import pandas as pd
df = pd.DataFrame(np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]]),
                  columns=['a', 'b', 'c'])
output:
   a  b  c
0  1  2  3
1  4  5  6
2  7  8  9
A single filter works:
filter = (df.b == 2)
df = df[filter]
output:
   a  b  c
0  1  2  3
But how can I filter with df.b == 2 or df.b == 5?
I tried:
filter = [(df['b']==2) | (df['b']==5)]
df = df[filter]
print(df)
I get:
ValueError: Item wrong length 1 instead of 3
Any suggestions on how to achieve this?
My desired output is:
   a  b  c
0  1  2  3
1  4  5  6
CodePudding user response:
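You are passing a list as the filter. The square brackets wrap the boolean Series in a list of length 1, which is why pandas complains about the length. Drop the brackets and try this (it is also better not to use filter as a variable name, since it shadows a Python built-in function):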
mask = ((df['b']==2) | (df['b']==5))
df = df[mask]
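To see where the "wrong length 1 instead of 3" message comes from, here is a minimal sketch using the DataFrame from the question. The names wrapped and result are only for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]]),
                  columns=['a', 'b', 'c'])

# Wrapping the condition in [] produces a list holding ONE Series,
# so the indexer has length 1 while the frame has 3 rows.
wrapped = [(df['b'] == 2) | (df['b'] == 5)]
print(len(wrapped))   # 1

# Without the brackets it is a boolean Series with one entry per row.
mask = (df['b'] == 2) | (df['b'] == 5)
print(mask.tolist())  # [True, True, False]

# Boolean indexing keeps the rows where the mask is True.
result = df[mask]
print(result)
```

The parentheses around each comparison are needed because | binds more tightly than == in Python.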
You can also use .isin() as an alternative solution, like below:
values = [2, 5]
df = df[df['b'].isin(values)]
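As a quick check, the sketch below compares the two approaches on the DataFrame from the question. The names or_mask, isin_mask and filtered are hypothetical, chosen just for this example.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]]),
                  columns=['a', 'b', 'c'])

# Two explicit comparisons combined with element-wise OR.
or_mask = (df['b'] == 2) | (df['b'] == 5)

# The same test expressed as membership in a list of values.
isin_mask = df['b'].isin([2, 5])

# Both are boolean Series with identical values.
print(or_mask.equals(isin_mask))  # True

filtered = df[isin_mask]
print(filtered)
```

.isin() tends to read better once there are more than two or three values to match, since the list can grow without chaining more | terms.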