How to match multiple words from list with pandas data frame column

Time:12-09

I have a list like this:

keyword_list = ['motorcycle love hobby ', 'bike love me', 'cycle', 'dirtbike cycle motorbike ']

I want to find these words in a pandas data frame column, and if 3 words match, a new column should be created with these words.

I need something like this:

[image: expected output]

CodePudding user response:

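You can probably use set operations: split each keyword phrase and each description into a set of words, then check whether the phrase's words are a subset of the description's words: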

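# Map each keyword phrase to the set of words it contains.
kw = {s: set(s.split()) for s in keyword_list}

def subset(s):
    # Return the first phrase whose words all appear in s, or None if no phrase matches.
    S1 = set(s.split())
    for k, S2 in kw.items():
        if S2.issubset(S1):
            return k
    return None

# Lowercase the descriptions so the comparison is case-insensitive.
df['trigram'] = [subset(s) for s in df['description'].str.lower()]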

print(df)

Output:

                                   description                 trigram
0  I love motorcycle though I have other hobby   motorcycle love hobby 
1                                  I have bike                    None
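
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same subset idea. It makes a few assumptions that are not in the original post: the sample rows in df are made up, only phrases with exactly three words are kept (so 'cycle' alone never matches), and words are extracted with a regular expression so that punctuation does not block a match. The name find_trigram is hypothetical.

```python
import re
import pandas as pd

keyword_list = ['motorcycle love hobby ', 'bike love me', 'cycle', 'dirtbike cycle motorbike ']

# Assumption: only phrases with exactly three words are candidates.
# Keys are stripped so the stored phrase has no trailing space.
kw = {s.strip(): set(s.split()) for s in keyword_list if len(s.split()) == 3}

def find_trigram(text):
    """Return the first three-word phrase whose words all occur in text, else None."""
    # Assumption: a word is a run of letters, digits, or underscores; punctuation is ignored.
    words = set(re.findall(r"\w+", text.lower()))
    for phrase, phrase_words in kw.items():
        if phrase_words.issubset(words):
            return phrase
    return None

# Hypothetical sample data, for illustration only.
df = pd.DataFrame({'description': [
    'I love motorcycle though I have other hobby',
    'I have bike',
    'Me? I love my bike.',
    'cycle to work',
]})

df['trigram'] = [find_trigram(s) for s in df['description']]
print(df)
```

With these assumptions, the first and third rows get a phrase and the other two stay empty. Word order does not matter, because sets ignore order; if the words must appear in sequence, a different approach is needed.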