Hello, I'm trying to filter a DataFrame so that it keeps only the rows whose column values appear in a dict of allowed values.
Here is a sample:
import random
import pandas as pd
df = pd.DataFrame({'type': random.choice(['222', '44']),  # let the size be 10k, for example
                   'method': random.choice(['open', 'close'])})
filter_dict = {'type': {0: ['44']}, 'method': {0: ['open', 'closed']}}
Filtering on 'method' works fine with df[df['method'].isin(filter_dict['method'][0])]
But when I try to filter the next column the same way, I get an empty DataFrame:
df[df['type'].isin(filter_dict['type'][0])]
I don't know why this is happening. Thanks for your answers.
The pandas version is 0.23.4.
CodePudding user response:
Your example is invalid: random.choice returns a single value rather than a sequence, hence the pandas error. Apart from that, the code works as expected:
import numpy as np
import pandas as pd
np.random.seed(0)
df = pd.DataFrame({'type': np.random.choice(['222', '44'], size=20),
                   'method': np.random.choice(['open', 'close'], size=20)})
filter_dict = {'type': {0: ['44']}, 'method': {0: ['open', 'closed']}}
df[df['type'].isin(filter_dict['type'][0])]
output:
    type  method
1     44   close
2     44   close
4     44    open
5     44   close
6     44   close
7     44   close
8     44   close
9     44    open
10    44   close
13    44    open
19    44    open
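To make the point about random.choice concrete, here is a minimal sketch of why the sample in the question cannot be built as written. The variable names `built` and `n` are only for illustration, and random.choices (Python 3.6+) is one possible standard-library replacement, not necessarily what the asker intended:

```python
import random
import pandas as pd

# random.choice picks ONE element, so each dict value is a scalar string.
value = random.choice(['222', '44'])
assert isinstance(value, str)

# A dict made only of scalars cannot be turned into a DataFrame without an index,
# so pandas raises ValueError here.
try:
    pd.DataFrame({'type': random.choice(['222', '44']),
                  'method': random.choice(['open', 'close'])})
    built = True
except ValueError:
    built = False

# random.choices (note the trailing "s") returns a list of k picks instead.
n = 10  # illustrative size; the question mentions 10k
df = pd.DataFrame({'type': random.choices(['222', '44'], k=n),
                   'method': random.choices(['open', 'close'], k=n)})
```

With list-valued columns like these, the isin filters from the question can be run on a frame of any size.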
If you want to test the condition on all columns:
mask = np.all([df[c].isin(filter_dict[c][0]) for c in df.columns], axis=0)
df[mask]
output:
    type  method
4     44    open
9     44    open
13    44    open
19    44    open
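As an alternative to np.all, the same all-columns test can probably be written with pandas alone. This is only a sketch under the same setup as above; `checks` and `result` are names made up for the example, and it loops over the keys of filter_dict so that columns without a filter are simply ignored:

```python
import numpy as np
import pandas as pd

np.random.seed(0)
df = pd.DataFrame({'type': np.random.choice(['222', '44'], size=20),
                   'method': np.random.choice(['open', 'close'], size=20)})
filter_dict = {'type': {0: ['44']}, 'method': {0: ['open', 'closed']}}

# One boolean column per filtered column: True where the value is allowed.
checks = pd.DataFrame({c: df[c].isin(filter_dict[c][0]) for c in filter_dict})

# Keep a row only if every filtered column passed its check.
mask = checks.all(axis=1)
result = df[mask]
```

Note that filter_dict allows 'closed' while the data contains 'close', so no 'close' row can ever match. That is why only the 'open' rows survive the combined filter.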