Home > OS >  Filter a pandas dataframe columns and rows using values from a dict
Filter a pandas dataframe columns and rows using values from a dict

Time:05-27

I need to filter a data frame with a dictionary, constructed with the key being the column name and the list of value being the value that I want to filter:

dict_filter = {'A':'high', 'B':'medium', 'C':['bottom', 'high']}
# the dataframe is like
df = pd.DataFrame({'id':[1,2], 'A':['high', 'high'], 'B':['high','medium'],'C':['high','bottom']})

the dataframe is like

    id      A          B          C   
0   1     'high'     'high'     'high'
1   1     'high'    'medium'    'bottom'

I would like to get a dataframe filtered as following:

    id      A          B          C   
1   1     'high'    'medium'    'bottom'

I tried the following method but it doesnt work with the fact that the last value of the dictionary is a list

df.loc[(df[list(dict_filter)] == pd.Series(dict_filter )).all(axis=1)]

Any suggestions ?

CodePudding user response:

bool_arr = []
for k, v in dict_filter.items():
    bool_arr.append(df.loc[:, k].isin(pd.Series(v)))
df.loc[pd.concat(bool_arr, axis=1).all(axis=1)]
#    id     A       B       C
# 1   2  high  medium  bottom

CodePudding user response:

Solution

We can use isin to create a boolean mask, but before that you have to make sure that all the values in the dict_filter are list of strings

d = {k: np.atleast_1d(v) for k, v in dict_filter.items()}
df[df[list(d)].isin(d).all(1)]

   id     A       B       C
1   2  high  medium  bottom

CodePudding user response:

One-liner:

filtered = df[df.apply(lambda col: col.isin(pd.Series(dict_filter.get(col.name, [])))).all(axis=1)]

Output:

>>> filtered
       A       B       C
id                      
2   high  medium  bottom

CodePudding user response:

You can use:

d = {k:v if isinstance (v, list) else [v]
    for k,v in dict_filter.items()}

mask = (df[list(dict_filter)]
 .apply(lambda c: c.isin(d[c.name]))
 .all(1)
 )

df2 = df[mask]

Output:

   id     A       B       C
1   2  high  medium  bottom
  • Related