Count the number of occurrences of specific words in a string

Time:06-20

I have a string and I want to count how many times certain parts of it (the names) occur. I thought it would be a good idea to turn the string into a list so that I could count the names by index, but this counts individual characters instead:

from collections import Counter

string = "['673', 'andy', '05/05/16']['986', 'emma', '16/01/18']['147', 'david', '05/04/16']['996', 'nigel', '26/04/17']['209', 'emma', '04/03/17']['619', 'david', '18/07/18']['768', 'andy', '18/11/15']"

string_list = list(string)
print(Counter(string_list))

Output:

Counter({"'": 42, '1': 15, ',': 14, ' ': 14, '/': 14, '0': 10, '6': 9, '[': 7, ']': 7, '7': 6, 'a': 6, 'd': 6, '8': 6, '9': 5, '5': 4, 'm': 4, '4': 4, 'n': 3, 'e': 3, 'i': 3, '3': 2, 'y': 2, 'v': 2, '2': 2, 'g': 1, 'l': 1})

Desired output:

andy: 2
emma: 2
david: 2
nigel: 1

CodePudding user response:

I would use ast.literal_eval, after a small change to the string so that it parses as a tuple of lists (adding a comma after each closing bracket), and then use collections.Counter on the second item of each list:

from ast import literal_eval
from collections import Counter

out = Counter(x[1] for x in literal_eval(string.replace(']', '],')))
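As a self-contained sketch of this approach, the snippet below wraps it in a small helper and prints the counts in the format asked for. The function name count_names is only illustrative, and the sketch assumes that no field contains a "]" character and that the name is always the second item.

```python
from ast import literal_eval
from collections import Counter

string = "['673', 'andy', '05/05/16']['986', 'emma', '16/01/18']['147', 'david', '05/04/16']['996', 'nigel', '26/04/17']['209', 'emma', '04/03/17']['619', 'david', '18/07/18']['768', 'andy', '18/11/15']"


def count_names(raw):
    """Count the second field of every bracketed record in raw (illustrative helper)."""
    # "[...][...]" is not a valid literal, but "[...],[...]," is a tuple of lists.
    records = literal_eval(raw.replace("]", "],"))
    return Counter(record[1] for record in records)


counts = count_names(string)

# Counter keeps first-seen order, so names print in order of first appearance.
for name, total in counts.items():
    print(f"{name}: {total}")
```

Because literal_eval only accepts Python literals, it does not execute arbitrary code the way eval would, which makes it a reasonable choice for text of this shape.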

Another, less robust, idea uses a regex: find the quoted item that has a comma both before and after it (the second one in each list) and feed the matches to collections.Counter:

import re
from collections import Counter

out = Counter(m.group(1) for m in re.finditer(r", '([^']+)',", string))

Output:

Counter({'andy': 2, 'emma': 2, 'david': 2, 'nigel': 1})
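For completeness, here is a runnable sketch of the regex idea using the string from the question. It uses re.findall rather than re.finditer, which is only a stylistic choice, and it assumes the names never contain a single quote and that only the middle field has a comma on both sides.

```python
import re
from collections import Counter

string = "['673', 'andy', '05/05/16']['986', 'emma', '16/01/18']['147', 'david', '05/04/16']['996', 'nigel', '26/04/17']['209', 'emma', '04/03/17']['619', 'david', '18/07/18']['768', 'andy', '18/11/15']"

# Match a quoted field preceded by ", " and followed by ",";
# the group captures one or more characters that are not a single quote.
names = re.findall(r", '([^']+)',", string)

counts = Counter(names)
print(counts)
```

If a record ever gained a fourth field, the third one would also sit between two commas and be counted, which is why this version is more fragile than parsing the string with literal_eval.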