I have two columns - one with sentences and the other with single words.
Sentence | word |
---|---|
"Such a day! It's a beautiful day out there" | "beautiful" |
"Such a day! It's a beautiful day out there" | "day" |
"I am sad by the sad weather" | "weather" |
"I am sad by the sad weather" | "sad" |
I want to count how many times each entry in the "word" column occurs in the "Sentence" column of the same row, to get this output:
Sentence | word | n |
---|---|---|
"Such a day! It's a beautiful day out there" | "beautiful" | 1 |
"Such a day! It's a beautiful day out there" | "day" | 2 |
"I am sad by the sad weather" | "weather" | 1 |
"I am sad by the sad weather" | "sad" | 2 |
I tried:
ok = []
for l in [x.split() for x in df['Sentence']]:
    for y in df['word']:
        ok.append(l.count(y))
However, it takes a very long time and never seems to finish, so it is not feasible for my actual dataset, which has 50k rows.
Can anyone help me achieve this?
CodePudding user response:
You can do it with zip, pairing each sentence with the word in the same row:
df['new'] = [x.count(y) for x, y in zip(df.Sentence,df.word)]
df
Out[419]:
                                     Sentence       word  new
0  Such a day! It's a beautiful day out there  beautiful    1
1  Such a day! It's a beautiful day out there        day    2
2                 I am sad by the sad weather    weather    1
3                 I am sad by the sad weather        sad    2
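One caveat: str.count counts substrings, so "sad" would also be counted inside "sadly". If only whole words should count, a regex with word boundaries is one option. The sketch below is an assumption about what might be wanted, not part of the answer above; the helper name count_whole_word and the second sentence are made up for illustration, and it assumes the words consist of letters or digits.

```python
import re

def count_whole_word(sentence, word):
    # \b matches a word boundary; re.escape keeps regex metacharacters
    # in `word` from being interpreted as pattern syntax.
    pattern = r"\b" + re.escape(word) + r"\b"
    return len(re.findall(pattern, sentence))

# First sentence is from the question; the second is a hypothetical example.
sentences = [
    "Such a day! It's a beautiful day out there",
    "I am sadly saddened by the sad weather",
]
words = ["day", "sad"]

# Substring counting, as in the zip answer.
substring_counts = [s.count(w) for s, w in zip(sentences, words)]
# Whole-word counting with the helper above.
whole_word_counts = [count_whole_word(s, w) for s, w in zip(sentences, words)]

print(substring_counts)   # [2, 3]
print(whole_word_counts)  # [2, 1]
```

With the DataFrame from the question, the same helper could be used as df['n'] = [count_whole_word(s, w) for s, w in zip(df.Sentence, df.word)].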
CodePudding user response:
Try using pandas DataFrame.apply with axis=1:
df['n'] = df.apply(lambda r: r['Sentence'].count(r['word']), axis=1)
Result:
                                     Sentence       word  n
0  Such a day! It's a beautiful day out there  beautiful  1
1  Such a day! It's a beautiful day out there        day  2
2                 I am sad by the sad weather    weather  1
3                 I am sad by the sad weather        sad  2
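For anyone who wants to try this end to end, here is a minimal self-contained sketch that rebuilds the example frame from the question and checks that the row-wise apply gives the same counts as a plain list comprehension over the two columns. The variable name n_zip is just for illustration. On large frames the comprehension tends to be faster, since apply with axis=1 builds a Series for every row, but that is worth timing on your own data.

```python
import pandas as pd

# Example data copied from the question.
df = pd.DataFrame({
    "Sentence": [
        "Such a day! It's a beautiful day out there",
        "Such a day! It's a beautiful day out there",
        "I am sad by the sad weather",
        "I am sad by the sad weather",
    ],
    "word": ["beautiful", "day", "weather", "sad"],
})

# Row-wise apply: the lambda receives each row as a Series.
df["n"] = df.apply(lambda r: r["Sentence"].count(r["word"]), axis=1)

# Same computation as a list comprehension, pairing the columns with zip.
n_zip = [s.count(w) for s, w in zip(df["Sentence"], df["word"])]

print(df["n"].tolist())  # [1, 2, 1, 2]
print(n_zip)             # [1, 2, 1, 2]
```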
CodePudding user response:
You can count how many times a substring occurs in a string with str.count, as in the code below:
# define string
string = "This is how you count same word of your defined string to another string using python"
substring = "string"
count = string.count(substring)
# print count
print(f"The count of the word {substring} is:", count)
Output: The count of the word string is: 2
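Two properties of str.count are worth keeping in mind here: it is case-sensitive, and it counts non-overlapping matches. A small sketch, using made-up example strings rather than anything from the question:

```python
# Hypothetical example text, chosen to show case sensitivity.
text = "Day after day, the daylight fades"

# Case-sensitive: "Day" is skipped, "day," and "daylight" are counted.
case_sensitive = text.count("day")

# Lowercasing first makes the count case-insensitive.
case_insensitive = text.lower().count("day")

# Matches do not overlap: "aaaa" contains "aa" twice, not three times.
non_overlapping = "aaaa".count("aa")

print(case_sensitive)    # 2
print(case_insensitive)  # 3
print(non_overlapping)   # 2
```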