Getting specific characters from a string and appending them to a list

Time:12-19

I have the following string:

my_string = "Jack could not account for his foolish mistake. Computers account for 5% of the country's commercial electricity consumption. It's time to focus on the company's core business. The company needs to focus on its biggest clients..."

and a list:

Phrasal_Verbs = [ "account for", "ache for", "act on", "act out", "act up", "add on", "focus on" ...]

I want to count how many occurrences each phrasal verb has in the string using Counter from the collections module, and then delete the phrasal verbs from the string. My code so far is:

from collections import Counter

phrasal_verbs_list = []
for pv in Phrasal_Verbs:
    if pv in my_string:
        phrasal_verbs_list.append(pv)
        my_string = my_string.replace(pv, "")
pv_count = dict(Counter(phrasal_verbs_list))

The code above finds all of the phrasal verbs, but even if a phrasal verb such as 'account for' occurs several times in the string, it is counted only once.

Expected pv_count = {'account for' : 2, 'focus on' : 2, rest_of_the_phrasal_verbs : occurrences }

Got = {'account for': 1, 'act out': 1, 'allow for': 1, 'be in': 1, 'be on': 1, 'blow down': 1, ... 'focus on' : 1}

CodePudding user response:

You could simply do this

pv_count = {string: my_string.count(string) for string in Phrasal_Verbs}

And then, if you want to remove the phrasal verbs from the string:

import re
text = re.sub("|".join(map(re.escape, pv_count.keys())), "", my_string)

You could also use Phrasal_Verbs directly instead of pv_count.keys(), as pointed out in the comments.
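
As a rough sketch, counting and removal can also share one compiled pattern. The word boundaries (\b) are an assumption on my part: the output in the question lists entries such as 'be in' and 'be on', which may come from matches inside longer words, and plain substring matching cannot rule that out. The names phrasal_verbs, pattern and remaining are made up for this example, and the sample text is shortened from the question.

```python
import re
from collections import Counter

# Shortened sample data based on the question (illustrative only).
my_string = (
    "Jack could not account for his foolish mistake. "
    "Computers account for 5% of the country's commercial electricity consumption. "
    "It's time to focus on the company's core business. "
    "The company needs to focus on its biggest clients."
)
phrasal_verbs = ["account for", "ache for", "act on", "act out", "act up", "add on", "focus on"]

# Escape each phrase so it is matched literally, try longer phrases first,
# and require word boundaries so a phrase is not matched inside other words.
alternatives = "|".join(
    re.escape(pv) for pv in sorted(phrasal_verbs, key=len, reverse=True)
)
pattern = re.compile(r"\b(?:" + alternatives + r")\b")

# Count every match, then remove all matches from the text.
pv_count = Counter(pattern.findall(my_string))
remaining = pattern.sub("", my_string)

print(dict(pv_count))
print(remaining)
```

Phrasal verbs that never occur are simply absent from pv_count here, and removing a phrase leaves a double space behind, which you may want to collapse afterwards.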

CodePudding user response:

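This happens because the if check runs only once per phrasal verb, and replace removes all occurrences of that string at once, so the remaining occurrences are never counted and every count ends up as 1.

Use my_string.replace(pv, "", 1) to replace only the first occurrence, and change the if to a while so that the check repeats until no occurrence is left. That should fix your problem.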
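
Limiting replace to a single occurrence only changes the count if the membership test is repeated, so here is a minimal sketch of the loop with while in place of if. The sample data is shortened from the question, and the names phrasal_verbs and found are only illustrative.

```python
from collections import Counter

# Shortened sample data based on the question (illustrative only).
my_string = (
    "Jack could not account for his foolish mistake. "
    "Computers account for 5% of the country's commercial electricity consumption. "
    "It's time to focus on the company's core business. "
    "The company needs to focus on its biggest clients."
)
phrasal_verbs = ["account for", "ache for", "act on", "act out", "act up", "add on", "focus on"]

found = []
for pv in phrasal_verbs:
    # Repeat until this phrasal verb no longer occurs in the string.
    while pv in my_string:
        found.append(pv)
        # Remove only the first remaining occurrence on each pass.
        my_string = my_string.replace(pv, "", 1)

pv_count = dict(Counter(found))
print(pv_count)
```

This still matches plain substrings, so a phrase that appears inside a longer word would be counted and removed as well.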
