Check the frequency of keywords from a df in a text

Time:11-25

I have a given text string: text = """Alice has two apples and bananas. Apples are very healty."""

and a dataframe:

word
apples
bananas
company

I would like to add a column "frequency" that counts how many times each word in the "word" column occurs in the text.

So the output should be as below:

word frequency
apples 2
bananas 1
company 0

CodePudding user response:

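import pandas as pd
df = pd.DataFrame(['apples', 'bananas', 'company'], columns=['word'])
# Lowercase the text once so the match is case-insensitive
para = "Alice has two apples and bananas. Apples are very healty.".lower()
# str.count returns the number of (substring) occurrences of each keyword
df['frequency'] = df['word'].apply(lambda x: para.count(x.lower()))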

    word    frequency
0   apples  2
1   bananas 1
2   company 0
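
One caveat with the answer above: str.count matches substrings, so "apples" would also be counted inside a word like "pineapples". If only whole words should count, something like the following might work. This is just a sketch, and count_whole_word is a made-up helper name, not part of pandas or re.

```python
import re
import pandas as pd

text = "Alice has two apples and bananas. Apples are very healty."
df = pd.DataFrame({'word': ['apples', 'bananas', 'company']})

def count_whole_word(word, text):
    # Hypothetical helper: \b anchors the match at word boundaries,
    # and re.escape protects keywords that contain regex metacharacters.
    pattern = r'\b' + re.escape(word) + r'\b'
    return len(re.findall(pattern, text, flags=re.IGNORECASE))

df['frequency'] = df['word'].apply(lambda w: count_whole_word(w, text))
print(df)
```

For the example text this gives the same counts as the substring version; the two only differ when a keyword appears inside a longer word.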

CodePudding user response:

  1. Convert the text to lowercase, then use a regex to split it into a list of words.
  2. For each row in the dataframe, use a lambda function to count how often that row's word appears in the list created in step 1.
# Import and create the data
import pandas as pd
import re
text = """Alice has two apples and bananas. Apples are very healty."""
df = pd.DataFrame(data={'word':['apples','bananas','company']})

# Solution
# \w+ matches each run of word characters, i.e. one whole word per match
words_list = re.findall(r'\w+', text.lower())
df['frequency'] = df['word'].apply(lambda x: words_list.count(x))
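
A possible variation on this answer: list.count rescans the whole word list for every row, which can get slow with a long text and many keywords. Tallying all words once with collections.Counter and then looking each keyword up is one way around that. A minimal sketch, assuming the same text and df as above:

```python
import re
from collections import Counter
import pandas as pd

text = """Alice has two apples and bananas. Apples are very healty."""
df = pd.DataFrame(data={'word': ['apples', 'bananas', 'company']})

# Tokenize once and tally every word in a single pass over the text
counts = Counter(re.findall(r'\w+', text.lower()))

# A Counter returns 0 for keys it has never seen, so absent keywords get 0
df['frequency'] = df['word'].apply(lambda w: counts[w.lower()])
print(df)
```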