Find exact match in column of list

Time:10-13

How can I create the column exactly_in_colb? For each row, it should contain the items of lookfor that appear in colb as exact matches (not as substrings).

lookfor = ["ap", "bana", "pear", "taste"]

df   col      colb
0   apple     ["app", "pl"]
1   banana    ["bana"]
2   pear      ["pear", "taste"]

Expected Output

df   col      colb                  exactly_in_colb
0   apple     ["app", "pl"]
1   banana    ["bana"]              ["bana"]
2   pear      ["pear", "taste"]     ["pear", "taste"]

pandas treats colb as an np.array. If I convert it to a string, the code runs but does not look for exact matches.

CodePudding user response:

Use set.intersection (note that a set does not preserve the original order of the items and drops duplicates):

lookfor = ["ap", "bana", "pear", "taste"]

df['exactly_in_colb'] = df['colb'].apply(lambda x: list(set(x).intersection(lookfor)))
print(df)
   df     col           colb exactly_in_colb
0   0   apple      [app, pl]              []
1   1  banana         [bana]          [bana]
2   2    pear  [pear, taste]   [pear, taste]
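
For reference, here is a self-contained sketch of the same approach. It assumes colb holds plain Python lists, as in the question's table; the DataFrame construction and the name lookfor_set are only for illustration. Because sets are unordered, the matches are sorted here to get a reproducible result:

```python
import pandas as pd

lookfor = ["ap", "bana", "pear", "taste"]

# Sample data rebuilt from the question (assumed to be plain lists)
df = pd.DataFrame({
    "col": ["apple", "banana", "pear"],
    "colb": [["app", "pl"], ["bana"], ["pear", "taste"]],
})

# Build the set once instead of once per row
lookfor_set = set(lookfor)

# Exact matches only: "ap" does not match "app"
df["exactly_in_colb"] = df["colb"].apply(
    lambda x: sorted(lookfor_set.intersection(x))
)
print(df)
```

Sorting is optional; drop it if the order of the matches does not matter.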

CodePudding user response:

You can use a list comprehension with a set lookup (for efficiency), which also keeps the original order of the items:

lookfor = ["ap", "bana", "pear", "taste"]
# make a set for efficient lookup
S = set(lookfor)

# looping through list to keep original order
df['exactly_in_colb'] = [[v for v in items if v in S] for items in df['colb']]

output:

      col            colb  exactly_in_colb
0   apple       [app, pl]               []
1  banana          [bana]           [bana]
2    pear   [pear, taste]    [pear, taste]
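
As a runnable sketch of this order-preserving variant: the helper name exact_matches and the modified last row (reordered, with a repeated item) are hypothetical and only there to show how it differs from a set intersection:

```python
import pandas as pd


def exact_matches(items, allowed):
    """Return the elements of `items` found in `allowed`, in original order."""
    return [v for v in items if v in allowed]


lookfor = ["ap", "bana", "pear", "taste"]
allowed = set(lookfor)  # set membership tests are O(1) on average

# Hypothetical data: the last row is reordered and repeats "taste"
df = pd.DataFrame({
    "col": ["apple", "banana", "pear"],
    "colb": [["app", "pl"], ["bana"], ["taste", "pear", "taste"]],
})

df["exactly_in_colb"] = [exact_matches(items, allowed) for items in df["colb"]]
print(df)
```

Unlike the set.intersection approach, this keeps both the order and the duplicates of each colb entry. Since the comprehension only iterates over each cell, it should also work if the cells are NumPy arrays of strings rather than lists, although that case is not tested here.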