Filtering a pandas dataframe based on a list of lists

Time:10-14

I have a pandas dataframe with a column of lists, and I'm trying to filter it based on another list of lists.

id           path
101     ['Activities (DEV)', 'public', '_yoyo_log']
102     ['Activities (DEV)', 'public', 'behavior_trackers']
103     ['Activities (DEV)', 'public', 'journal_entries']
104     ['Social (PROD)', 'public', 'starva_activity']
105     ['pg-prd (DEV-RR)', 'public', 'activities']
106     ['pg-prd (DEV-RR)', 'public', 'blackouts']

And a list of lists:

slist = [['activities (dev)', 'public', 'behavior_trackers'],
        ['activities (dev)', 'public', 'journal_entries'],
        ['pg-prd (dev-rr)', 'public', 'activities']]

What I am trying to do is keep only the rows whose path matches one of the lists in slist, ignoring case. This is what I tried:

df = df[df['path'].apply(lambda x: eval(str(x).lower())).isin(slist)]

This approach works sometimes, but most of the time it throws an error:

TypeError: unhashable type: 'list'

I want my output to look like this:

id           path
102     ['Activities (DEV)', 'public', 'behavior_trackers']
103     ['Activities (DEV)', 'public', 'journal_entries']
105     ['pg-prd (DEV-RR)', 'public', 'activities']

Is there a better way to do that or am I missing something? I am using pyenv 3.6.2

CodePudding user response:

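Lists are unhashable, which is why isin raises the TypeError. Use tuples on both sides: convert each value in the column to a tuple, and convert the inner lists of slist to tuples as well: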

t = [tuple(x) for x in slist]
df = df[df['path'].apply(lambda x: tuple(eval(str(x).lower()))).isin(t)]

Or, if the column already holds real lists rather than their string representations, skip eval:

df = df[df['path'].apply(lambda x: tuple([y.lower() for y in x])).isin(t)]

print(df)
    id                                           path
1  102  [Activities (DEV), public, behavior_trackers]
2  103    [Activities (DEV), public, journal_entries]
4  105          [pg-prd (DEV-RR), public, activities]
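
For reference, here is a self-contained sketch of the same tuple approach that can be run as-is. The dataframe is rebuilt from the sample in the question. The helper name normalize is made up for this example, and using ast.literal_eval in place of eval for values stored as strings is a swapped-in assumption, not part of the answer above.

```python
import ast

import pandas as pd

# Sample data rebuilt from the question.
df = pd.DataFrame({
    "id": [101, 102, 103, 104, 105, 106],
    "path": [
        ["Activities (DEV)", "public", "_yoyo_log"],
        ["Activities (DEV)", "public", "behavior_trackers"],
        ["Activities (DEV)", "public", "journal_entries"],
        ["Social (PROD)", "public", "starva_activity"],
        ["pg-prd (DEV-RR)", "public", "activities"],
        ["pg-prd (DEV-RR)", "public", "blackouts"],
    ],
})

slist = [
    ["activities (dev)", "public", "behavior_trackers"],
    ["activities (dev)", "public", "journal_entries"],
    ["pg-prd (dev-rr)", "public", "activities"],
]


def normalize(path):
    """Hypothetical helper: return a lowercase tuple for a list or its string form."""
    if isinstance(path, str):
        # Assumption: the string is a Python list literal, e.g. after reading a CSV.
        # literal_eval only parses literals, so it does not execute arbitrary code.
        path = ast.literal_eval(path)
    return tuple(part.lower() for part in path)


# Tuples are hashable, so isin can compare them.
wanted = [normalize(p) for p in slist]
mask = df["path"].apply(normalize).isin(wanted)

# The original mixed-case lists are kept; only the comparison is lowercased.
filtered = df[mask]
print(filtered["id"].tolist())  # [102, 103, 105]
```

Building the mask separately leaves the path column untouched, so the output keeps the original casing, as in the expected result.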