Show the word found from a set of words within a column

I'm trying to use Python to search each row of a DataFrame column for the words in a list. I want two new columns: one showing the words found, separated by commas, and another with the count of the words found.

This is my string list

string_list = ["never sounded", "she", "was time", "against"]

and this is the df I want to obtain

CodePudding user response：

First, I split the string into words so that only exact word matches are found. That way, searching for a word like "a" doesn't match every letter "a" in the string.

wordsToFind = "beautiful sunny"
stringToSearch = "today will be a beautiful sunny day"

foundStrings = []
stringsToFind = wordsToFind.split()

# Split the text once, lowercased, so matching is by whole word and case-insensitive
list_stringSeparatedByWord = stringToSearch.lower().split()

for s in stringsToFind:
    if s.lower() in list_stringSeparatedByWord:
        foundStrings.append(s)

print(foundStrings)  # ['beautiful', 'sunny']
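
The split-based check above only handles single words, while the list in the question also contains two-word phrases such as "never sounded". Here is a minimal sketch of one way to cover both, assuming a regular-expression search with word boundaries is acceptable. The helper name find_phrases and the sample sentence are made up for illustration:

```python
import re

def find_phrases(phrases, text):
    """Return the phrases that occur in text as whole words, ignoring case."""
    found = []
    for phrase in phrases:
        # \b anchors the match at word boundaries, so "she" does not match inside "ashes";
        # re.escape guards against regex metacharacters in the phrase
        if re.search(r"\b" + re.escape(phrase) + r"\b", text, flags=re.IGNORECASE):
            found.append(phrase)
    return found

string_list = ["never sounded", "she", "was time", "against"]
# Hypothetical sample sentence
print(find_phrases(string_list, "It was time, and she never sounded upset"))
# ['never sounded', 'she', 'was time']
```

Unlike splitting on whitespace, this also finds "was time" when it is followed by punctuation, because the comma counts as a word boundary.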

CodePudding user response：

Extending Stephan's answer, here is a more declarative, Pythonic approach.

It sounds like you are trying to find the intersection of the words you are looking for and the words that exist in the text. You can achieve this using set intersection. https://docs.python.org/3.8/library/stdtypes.html#frozenset.intersection

Code:

text = "today will be a beautiful sunny day"
get_words = "beautiful sunny"

# Sets are unordered, so sort the intersection to get a stable result
found_words = sorted(set(text.split()).intersection(get_words.split()))

Result:

found_words == ['beautiful', 'sunny']

To use this in pandas across multiple rows, you can use df.assign, which creates a new column based on operations on existing columns. https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.assign.html

Code:

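get_words = "beautiful sunny"
# Format the words found in one cell as a comma-separated string, in sorted order
word_finder = lambda text: ', '.join(sorted(set(text.split()).intersection(get_words.split())))

# df.assign passes the whole DataFrame to a callable, so apply the finder to each cell of 'text'
df = df.assign(found_words=df['text'].apply(word_finder))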

Result:

text                                  | found_words
--------------------------------------------------------------
today will be a beautiful sunny day   | beautiful, sunny
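
The question also asks for a count column and uses multi-word phrases, which the set intersection does not cover. Below is a rough sketch of one way to get both columns, assuming the text lives in a column named text (as in the example above) and that regex matching through Series.str.findall is acceptable. The sample rows and the found_count column name are made up for illustration:

```python
import re
import pandas as pd

string_list = ["never sounded", "she", "was time", "against"]

# Build one alternation pattern; longer phrases go first so they win over shorter overlaps.
# \b keeps matches on word boundaries, and re.escape guards regex metacharacters.
ordered = sorted(string_list, key=len, reverse=True)
pattern = r"\b(?:" + "|".join(re.escape(s) for s in ordered) + r")\b"

# Hypothetical sample data; the original DataFrame is not shown in the question
df = pd.DataFrame({"text": [
    "She said it was time to go",
    "It never sounded right against the wall",
    "Nothing to report here",
]})

# One list of matches per row, in the order and casing they appear in the text
matches = df["text"].str.findall(pattern, flags=re.IGNORECASE)

df = df.assign(
    found_words=matches.str.join(", "),  # comma-separated matches
    found_count=matches.str.len(),       # number of matches in the row
)
print(df)
```

Note that findall keeps repeated matches, so a phrase occurring twice in a row is listed and counted twice. If each phrase should count only once, the lists would need to be deduplicated before joining.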