Counting word frequency in a sentence

I have two columns - one with sentences and the other with single words.

Sentence	word
"Such a day! It's a beautiful day out there"	"beautiful"
"Such a day! It's a beautiful day out there"	"day"
"I am sad by the sad weather"	"weather"
"I am sad by the sad weather"	"sad"

For each row, I want to count how many times the value in the "word" column occurs in that row's "Sentence" value, and get this output:

Sentence	word	n
"Such a day! It's a beautiful day out there"	"beautiful"	1
"Such a day! It's a beautiful day out there"	"day"	2
"I am sad by the sad weather"	"weather"	1
"I am sad by the sad weather"	"sad"	2

I tried:

ok = []
for l in [x.split() for x in df['Sentence']]:
    for y in df['word']:
        ok.append(l.count(y))

However, it never seems to finish: it takes a very long time to run, so it is not feasible for my actual dataset, which has 50k rows.

Can anyone help me achieve this?

CodePudding user response：

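The nested loop counts every word in every sentence (50k x 50k = 2.5 billion counts for your data), but you only need one count per row. You can pair each sentence with the word on the same row using zip. Note that str.count counts substring matches, so "day" would also be counted inside "today".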

df['new'] = [x.count(y) for x, y in zip(df.Sentence,df.word)]
df
Out[419]: 
                                     Sentence       word  new
0  Such a day! It's a beautiful day out there  beautiful    1
1  Such a day! It's a beautiful day out there        day    2
2                 I am sad by the sad weather    weather    1
3                 I am sad by the sad weather        sad    2
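
If only whole-word matches should count, one possible variation is to use a regular expression with word boundaries instead of str.count. This is a minimal sketch, not part of the answer above: count_whole_word is a hypothetical helper name, and plain lists stand in for the two DataFrame columns.

```python
import re

def count_whole_word(sentence: str, word: str) -> int:
    """Count whole-word occurrences of `word` in `sentence` (hypothetical helper)."""
    # re.escape protects against regex metacharacters in the word;
    # \b anchors the match at word boundaries, so "day" does not match inside "today".
    pattern = r"\b" + re.escape(word) + r"\b"
    return len(re.findall(pattern, sentence))

# Plain lists standing in for df['Sentence'] and df['word'] from the question.
sentences = [
    "Such a day! It's a beautiful day out there",
    "Such a day! It's a beautiful day out there",
    "I am sad by the sad weather",
    "I am sad by the sad weather",
]
words = ["beautiful", "day", "weather", "sad"]

counts = [count_whole_word(s, w) for s, w in zip(sentences, words)]
print(counts)  # [1, 2, 1, 2]
```

With the DataFrame from the question, the same helper could be applied as df['n'] = [count_whole_word(s, w) for s, w in zip(df['Sentence'], df['word'])]. Like str.count, this match is case-sensitive.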

CodePudding user response：

Try using DataFrame.apply with axis=1, which passes each row to the function:

df['n'] = df.apply(lambda r: r['Sentence'].count(r['word']), axis=1)

Result:

                                     Sentence       word  n
0  Such a day! It's a beautiful day out there  beautiful  1
1  Such a day! It's a beautiful day out there        day  2
2                 I am sad by the sad weather    weather  1
3                 I am sad by the sad weather        sad  2
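
Since each sentence appears on several rows in the question's data, another option is to tokenise each distinct sentence once and then look up every word in the cached counts. The sketch below is an assumption-laden alternative rather than the method above: count_words_cached is a hypothetical helper, it lowercases everything, and it splits on runs of word characters, so "It's" becomes "it" and "s".

```python
import re
from collections import Counter

def count_words_cached(sentences, words):
    """Count each word in its paired sentence, tokenising every distinct sentence once."""
    cache = {}   # sentence -> Counter of its lowercase tokens
    counts = []
    for sentence, word in zip(sentences, words):
        if sentence not in cache:
            # \w+ keeps runs of letters, digits and underscores and drops punctuation.
            cache[sentence] = Counter(re.findall(r"\w+", sentence.lower()))
        # A Counter returns 0 for tokens it has never seen.
        counts.append(cache[sentence][word.lower()])
    return counts

# Plain lists standing in for df['Sentence'] and df['word'] from the question.
sentences = [
    "Such a day! It's a beautiful day out there",
    "Such a day! It's a beautiful day out there",
    "I am sad by the sad weather",
    "I am sad by the sad weather",
]
words = ["beautiful", "day", "weather", "sad"]

result = count_words_cached(sentences, words)
print(result)  # [1, 2, 1, 2]
```

With the DataFrame from the question this could be used as df['n'] = count_words_cached(df['Sentence'], df['word']). Unlike str.count, it matches whole tokens only and ignores case.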

CodePudding user response：

You can count the occurrences of a substring in a string with str.count, as in the code below:

# define string
string = "This is how you count same word of your defined string to another string using python"
substring = "string"

count = string.count(substring)

# print count
print(f"The count of the word {substring} is:", count)

Output: The count of the word string is: 2
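
Two properties of str.count are worth keeping in mind when using it for word counts: it is case-sensitive, and it counts non-overlapping matches. A small illustration with made-up example strings (not from the question's data):

```python
text = "Day after day, the same day"

# Case-sensitive: the leading "Day" is not counted.
case_sensitive = text.count("day")
# Lowercasing first makes the count case-insensitive.
case_insensitive = text.lower().count("day")
# Matches do not overlap: "aaa" contains only one non-overlapping "aa".
non_overlapping = "aaa".count("aa")

print(case_sensitive, case_insensitive, non_overlapping)  # 2 3 1
```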