Retaining records in one text column based on keywords in another column of a pandas dataframe

Time:12-19

My dataframe has two columns: text, which holds sentences, and keyword, which holds a list of keywords that I want to use to filter the text column.

I want to filter the rows based on the keyword column. If any of the words in a row's keyword list occurs in that row's text, the row is kept; otherwise it is dropped.

The output dataframe should contain only the rows whose text matches at least one of their keywords.

I tried using the str.contains() function in pandas, which does not work here because contains() expects a single regex/pattern string, not a column of keyword lists.

df['text'].str.contains(df['keyword'].str)

I got the error below:

TypeError: first argument must be string or compiled pattern

CodePudding user response:

Use the built-in any function to check, row by row, whether any keyword in the row's list occurs within that row's text:

df = df[df.apply(lambda x: any(k in x.text for k in x.keyword), axis=1)]
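To try the one-liner above end to end, here is a minimal sketch. The sample data is hypothetical, since the original dataframe was only posted as an image, and it assumes the keyword column holds Python lists of strings. Note that `k in text` is a substring test, so "cat" also matches "category"; the second mask is one possible whole-word variant, assuming words are separated by whitespace.

```python
import pandas as pd

# Hypothetical sample data (the original frame is not available).
df = pd.DataFrame({
    "text": [
        "the cat sat on the mat",
        "stocks fell sharply today",
        "rain is expected tomorrow",
        "category theory is abstract",
    ],
    "keyword": [
        ["cat", "dog"],
        ["weather", "rain"],
        ["rain", "snow"],
        ["cat"],
    ],
})

# Substring match: keep a row if any of its keywords occurs inside its text.
substring_mask = df.apply(
    lambda row: any(k in row["text"] for k in row["keyword"]), axis=1
)
filtered = df[substring_mask]

# Whole-word variant (an assumption, not part of the original answer):
# split the text on whitespace and intersect with the keyword set.
word_mask = df.apply(
    lambda row: bool(set(row["text"].lower().split())
                     & {k.lower() for k in row["keyword"]}),
    axis=1,
)
filtered_words = df[word_mask]

print(filtered)
print(filtered_words)
```

With this data the substring version keeps the "category theory" row, while the whole-word version drops it. Which behaviour is wanted depends on the keywords.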