I have a list of keywords that I want to match against a list of sentences. If a keyword is found in a sentence, it should be returned in a list.
What I have tried:
sentence = df['List of Content']
list_of_words = ['keyword1','keyword2', 'keyword3']
This works if I choose only one row:
[word for word in list_of_words if word in sentence[0]]
and outputs
output: ['keyword1', 'keyword3']
The desired output for all rows is a list of the keywords that match in each sentence. Something like this:
matching_keywords = [['keyword1', 'keyword3'],['keyword2', 'keyword3'],['keyword1', 'keyword2']..]
However, when I run the for loop over the entire list, it just outputs an empty list []
I have also tried a nested for loop:
kwords = []
for row in MCC:
    for x in list_of_words:
        if x in row:
            kwords.append(x)
It either gives me an empty list [] again, or it creates one long list of the keywords repeating themselves.
What mistake am I making? Can anyone help me with the logic or a solution?
CodePudding user response:
You could extend your initial approach by doing the following.
[[word for word in list_of_words if word in row] for row in sentence]
Explanation: This amounts to a nested list comprehension. For each row, we want a list of the keywords that appear in that row. With a list comprehension, this can be written as
[<list of keywords in row> for row in sentence]
On the other hand, if you have a specific row that you're looking at (for instance, row = sentence[0]), then, as you state in your question, the list of keywords that appear in this row can be obtained with [word for word in list_of_words if word in row]. Putting this together leads to the result I wrote above, namely
[[word for word in list_of_words if word in row] for row in sentence]
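As a minimal sketch of the above, here is the nested comprehension run on a small plain Python list. The three sample sentences are made up for illustration and simply stand in for df['List of Content']; only the names sentence and list_of_words come from the question. The explicit-loop version is included for comparison with the nested for loop in the question.

```python
# Hypothetical sample data standing in for df['List of Content'].
sentence = [
    "keyword1 appears here together with keyword3",
    "this row mentions keyword2 and keyword3",
    "nothing relevant in this row",
]
list_of_words = ['keyword1', 'keyword2', 'keyword3']

# One inner list per row, holding the keywords found in that row.
matching_keywords = [
    [word for word in list_of_words if word in row]
    for row in sentence
]
print(matching_keywords)
# [['keyword1', 'keyword3'], ['keyword2', 'keyword3'], []]

# Equivalent explicit loops: a fresh inner list is created for every row
# and appended to the outer list once the row has been checked.
matching_loop = []
for row in sentence:
    found = []
    for word in list_of_words:
        if word in row:
            found.append(word)
    matching_loop.append(found)
```

In the question's loop, kwords is created once outside the loop, so every match from every row ends up in the same flat list. Creating the inner list inside the outer loop, as above, gives one list per row. As for the empty result, one possible cause (an assumption, since the exact code is not shown) is testing `word in sentence` against the whole pandas Series, which checks the index labels rather than the values.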
CodePudding user response:
Because you are using pandas, you can use apply, like below:
df['List of Content'].apply(lambda x : [i for i in x.split() if i in list_of_words]).tolist()
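A rough, self-contained sketch of this approach follows. The sample frame is invented for illustration (only the column name is taken from the question), and it assumes every row is a string with no missing values. It also shows that splitting on whitespace matches whole tokens only, so a keyword followed by punctuation is missed, whereas the substring test from the first answer still finds it.

```python
import pandas as pd

# Hypothetical sample frame; only the column name comes from the question.
df = pd.DataFrame({
    'List of Content': [
        "keyword1 appears here, then keyword3",
        "keyword2 and keyword3.",
        "no match here",
    ]
})
list_of_words = ['keyword1', 'keyword2', 'keyword3']

# Token-based: split each row on whitespace and keep tokens that are keywords.
# "keyword3." (with the trailing period) is not equal to "keyword3".
by_token = df['List of Content'].apply(
    lambda x: [i for i in x.split() if i in list_of_words]
).tolist()

# Substring-based: keep each keyword that occurs anywhere in the row.
by_substring = df['List of Content'].apply(
    lambda x: [w for w in list_of_words if w in x]
).tolist()

print(by_token)      # [['keyword1', 'keyword3'], ['keyword2'], []]
print(by_substring)  # [['keyword1', 'keyword3'], ['keyword2', 'keyword3'], []]
```

Which variant is appropriate depends on whether partial or punctuation-adjacent matches should count. The token-based version also repeats a keyword if it occurs more than once in a row and returns keywords in the order they appear in the sentence.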