I have the following string:
my_string = "Jack could not account for his foolish mistake. Computers account for 5% of the country's commercial electricity consumption. It's time to focus on the company's core business. The company needs to focus on its biggest clients..."
and a list:
Phrasal_Verbs = [ "account for", "ache for", "act on", "act out", "act up", "add on", "focus on" ...]
I want to count how many occurrences each phrasal verb has in the string, using Counter from the collections module, and delete the phrasal verbs from the string. My code so far is:
from collections import Counter

phrasal_verbs_list = []
for pv in Phrasal_Verbs:
    if pv in my_string:
        phrasal_verbs_list.append(pv)
        my_string = my_string.replace(pv, "")
pv_count = dict(Counter(phrasal_verbs_list))
The code above finds all of the phrasal verbs, but even if a phrasal verb such as 'account for' occurs several times in the string, it is counted only once.
Expected pv_count = {'account for' : 2, 'focus on' : 2, rest_of_the_phrasal_verbs : occurrences }
Got = {'account for': 1, 'act out': 1, 'allow for': 1, 'be in': 1, 'be on': 1, 'blow down': 1, ... 'focus on' : 1}
CodePudding user response:
You could simply do this:
pv_count = {string: my_string.count(string) for string in Phrasal_Verbs}
And then, if you want to remove the phrasal verbs from the string:
import re
text = re.sub("|".join(map(re.escape, pv_count.keys())), "", my_string)
You could also use Phrasal_Verbs directly instead of `pv_count.keys()`, as pointed out in the comments.
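Here is a minimal runnable sketch of this approach. It assumes a shortened version of the list from the question (the elided entries are left out) and the sample string with the missing space restored. Dropping phrases with a count of zero and escaping each phrase with re.escape are additions for illustration, not part of the original answer.

```python
import re

my_string = (
    "Jack could not account for his foolish mistake. "
    "Computers account for 5% of the country's commercial electricity consumption. "
    "It's time to focus on the company's core business. "
    "The company needs to focus on its biggest clients..."
)
# Shortened list, assumed for this example only.
Phrasal_Verbs = ["account for", "ache for", "act on", "act out", "act up", "add on", "focus on"]

# Count non-overlapping occurrences of every phrase in the string.
all_counts = {pv: my_string.count(pv) for pv in Phrasal_Verbs}

# Keep only the phrases that actually occur (optional filtering step).
pv_count = {pv: n for pv, n in all_counts.items() if n > 0}

# Remove every matched phrase in a single pass.
# re.escape guards against phrases that contain regex metacharacters.
pattern = "|".join(re.escape(pv) for pv in pv_count)
text = re.sub(pattern, "", my_string)

print(pv_count)  # {'account for': 2, 'focus on': 2}
```

Note that str.count matches plain substrings, so a short phrase such as "be in" would also be counted inside "be interesting". If that matters, wrapping each phrase in \b word boundaries in a regex is one possible refinement.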
CodePudding user response:
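This happens because replace removes all occurrences of that string at once, while the if block runs only once per phrasal verb, so each verb is appended to the list a single time and ends up with a count of 1.
Use my_string.replace(pv, "", 1) to remove only the first occurrence, and change the if to a while so that the check repeats until no occurrence is left. That should fix your problem.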
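Here is a minimal sketch of this fix, using a shortened list assumed for the example. It also turns the if check from the question into a while loop, since a single if would still append each phrase only once even when replace is limited to one occurrence.

```python
from collections import Counter

my_string = (
    "Jack could not account for his foolish mistake. "
    "Computers account for 5% of the country's commercial electricity consumption. "
    "It's time to focus on the company's core business. "
    "The company needs to focus on its biggest clients..."
)
# Shortened list, assumed for this example only.
Phrasal_Verbs = ["account for", "ache for", "act on", "act out", "act up", "add on", "focus on"]

phrasal_verbs_list = []
for pv in Phrasal_Verbs:
    # Repeat until no occurrence of this phrase is left in the string.
    while pv in my_string:
        phrasal_verbs_list.append(pv)
        # The third argument limits the replacement to the first occurrence.
        my_string = my_string.replace(pv, "", 1)

pv_count = dict(Counter(phrasal_verbs_list))
print(pv_count)  # {'account for': 2, 'focus on': 2}
```

Each pass through the while loop records one occurrence and removes it, so the Counter ends up with the real number of matches per phrase.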