I have a list like this:
keyword_list = ['motorcycle love hobby ', 'bike love me', 'cycle', 'dirtbike cycle motorbike ']
I want to search for these words in a pandas DataFrame column, and if all 3 words of an entry match, create a new column containing those words.
I need something like this:
CodePudding user response:
You can probably use set operations:
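# Map each keyword phrase to the set of words it contains.
kw = {s: set(s.split()) for s in keyword_list}

def subset(s):
    # Words present in the description.
    S1 = set(s.split())
    for k, S2 in kw.items():
        # Return the first phrase whose words all appear in the description.
        if S2.issubset(S1):
            return k
    # No phrase matched: falls through and returns None.

df['trigram'] = [subset(s) for s in df['description'].str.lower()]
print(df)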
Output:
                                   description                trigram
0  I love motorcycle though I have other hobby  motorcycle love hobby
1                                  I have bike                   None
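
Note that the function above returns only the first matching phrase, and a single-word entry such as 'cycle' matches any description containing that word. If the "3 words" requirement is strict, a variant along these lines might work. This is only a sketch: the sample DataFrame is made up from the two rows in the output above, and the all_matches helper and its min_words parameter are hypothetical names, not part of the original answer.

```python
import pandas as pd

keyword_list = ['motorcycle love hobby ', 'bike love me', 'cycle', 'dirtbike cycle motorbike ']

# Sample data (assumed): only the two rows shown in the output above.
df = pd.DataFrame({'description': [
    'I love motorcycle though I have other hobby',
    'I have bike',
]})

# Map each whitespace-trimmed phrase to the set of words it contains.
kw = {s.strip(): set(s.split()) for s in keyword_list}

def all_matches(text, min_words=3):
    """Return every phrase with at least min_words words, all found in text."""
    words = set(text.lower().split())
    return [k for k, s in kw.items()
            if len(s) >= min_words and s.issubset(words)]

df['trigram'] = df['description'].apply(all_matches)
print(df)
```

Rows without a match get an empty list here instead of None, which keeps the column type uniform. Matching is on whole whitespace-separated words, so punctuation attached to a word (for example "hobby.") would prevent a match.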