Return substring if present in a string and match with case insensitive python

Time:10-20

I am trying to return a substring if it is present in a string, using a case-insensitive match.

For example, I want to return the string "apple" when the sentence is "Apple is cool" or "I like APPLE", but not when it is "I like apples".

What I have so far is this:

import pandas as pd

df_word_list = pd.DataFrame({'word': ['apple', 'cool']})
df = pd.DataFrame({'sentence': ['Apple is cool', 'I like APPLE', 'I like apples']})

# i is the index of the row being checked
words = [x for x in df_word_list['word'].tolist() if x in str(df['sentence'][i])]

This returns the matching words, but it is case sensitive. Does anyone know how to make it case insensitive?

I would like the final output to be

  1. apple, cool
  2. apple

Row 3 is empty because it has an "s" ("apples" instead of "apple").

df_word_list is the dataframe of words that I want to identify. df is the dataframe that contains the sentences.

CodePudding user response:

df.sentence.str.lower().str.split().apply(lambda l: ", ".join([x for x in l if x in df_word_list["word"].values]))

The result is a pandas Series of strings:

0    apple, cool
1          apple
2              
Name: sentence, dtype: object
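
Note that splitting on whitespace only matches words that stand alone, so a sentence such as "Apple, is cool" would miss "apple" because of the attached comma. As an alternative, here is a minimal sketch using the standard `re` module with word boundaries and `re.IGNORECASE`. The helper name `find_words` is hypothetical (it is not from the question), and the sample lists simply mirror the question's data:

```python
import re

def find_words(sentence, words):
    """Return the entries of `words` that occur in `sentence` as whole words, ignoring case."""
    found = []
    for word in words:
        # \b anchors the match at word boundaries, so "apple" does not match "apples"
        pattern = r"\b" + re.escape(word) + r"\b"
        if re.search(pattern, sentence, flags=re.IGNORECASE):
            found.append(word)
    return found

# Sample data mirroring the question
sentences = ["Apple is cool", "I like APPLE", "I like apples"]
words = ["apple", "cool"]

results = [", ".join(find_words(s, words)) for s in sentences]
print(results)
```

With the dataframes from the question, the same helper could be applied per row with df['sentence'].apply(lambda s: ", ".join(find_words(s, df_word_list['word']))). Unlike the split-based version, this returns the matches in the order of the word list rather than the order they appear in the sentence.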