Python - Counting Letter Frequency in a String-CodePudding

I want to write out the letter frequencies for each of my strings. My inputs and expected outputs look like this:

"aaaa" -> "a4"
"abb" -> "a1b2"
"abbb cc a" -> "a1b3 c2 a1"
"bbbaaacddddee" -> "b3a3c1d4e2"
"a   b" -> "a1 b1"

I found this solution, but it gives the frequencies in random order. How can I get them in the order shown above?

CodePudding user response：

Does this satisfy your needs?

from itertools import groupby
s = "bbbaaac ddddee aa"

groups = groupby(s)
result = [(label, sum(1 for _ in group)) for label, group in groups]
res1 = "".join("{}{}".format(label, count) for label, count in result)
# 'b3a3c1 1d4e2 1a2'

# keep spaces as plain spaces: drop the count that groupby attached to each run of spaces
import re
re.sub(' [0-9]+', ' ', res1)
# 'b3a3c1 d4e2 a2'
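
The regex clean-up step can be avoided by treating spaces specially while iterating over the groups. Below is a minimal sketch of that idea, not the answerer's code; the function name run_lengths is hypothetical. Note that groupby counts consecutive runs, so a letter that reappears later in the same word gets a second entry.

```python
from itertools import groupby


def run_lengths(s: str) -> str:
    """Encode each run of identical letters as letter + run length."""
    parts = []
    for char, run in groupby(s):
        if char == " ":
            # Collapse a run of spaces into one separator, without a count.
            parts.append(" ")
        else:
            parts.append(f"{char}{sum(1 for _ in run)}")
    return "".join(parts)


print(run_lengths("bbbaaac ddddee aa"))  # b3a3c1 d4e2 a2
print(run_lengths("abab"))               # a1b1a1b1 (runs, not totals)
```

If totals per word are wanted instead ("abab" giving a2b2), a counting approach is a better fit than groupby.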

CodePudding user response：

For me, this is a little trickier than it looks at first. For example, "bbbaaacddddee" -> "b3a3c1d4e2" suggests that the counts need to be output in the order in which the letters first appear in the input string:

import re

def unique_elements(t):
    l = []
    for w in t:
        if w not in l:
            l.append(w)
    return l

def splitter(s):
    res = []
    tokens = re.split("[ ]+", s)  # split on runs of one or more spaces
    for token in tokens:
        s1 = unique_elements(token) # or s1 = sorted(set(token))
        # pair each unique letter with its total count in this token
        this_count = "".join([k + str(token.count(k)) for k in s1])
        res.append(this_count)
    return " ".join(res)

print(splitter("aaaa"))
print(splitter("abb")) 
print(splitter("abbb cc a"))
print(splitter("bbbaaacddddee")) 
print(splitter("a   b"))

OUTPUT

a4
a1b2
a1b3 c2 a1
b3a3c1d4e2
a1 b1

If the order of appearance does not matter, you can drop the unique_elements function and simply use s1 = sorted(set(token)) within splitter, as indicated in the comment.
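
As a more compact variant of the same idea, collections.Counter can stand in for both unique_elements and the repeated count calls, because a Counter remembers the order in which keys were first seen (Python 3.7+). This is a minimal sketch, assuming words are separated by whitespace; letter_counts is a hypothetical name, not part of the answer above.

```python
from collections import Counter


def letter_counts(s: str) -> str:
    """Per-word letter totals, in order of first appearance."""
    # str.split() with no argument collapses runs of whitespace.
    return " ".join(
        "".join(f"{char}{count}" for char, count in Counter(word).items())
        for word in s.split()
    )


print(letter_counts("abbb cc a"))      # a1b3 c2 a1
print(letter_counts("bbbaaacddddee"))  # b3a3c1d4e2
print(letter_counts("abab"))           # a2b2 (totals, not runs)
```

Each word is scanned once here, instead of once per unique letter as with token.count.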

CodePudding user response：

Here is your answer:

test_str = "here is your answer"
words = test_str.split()  # split on whitespace
for word in words:
    res = {}  # per-word counts; dicts keep insertion order (Python 3.7+)
    for char in word:
        res[char] = res.get(char, 0) + 1
    for key, value in res.items():
        print(f"{key}{value}", end="")
    print(end=" ")

CodePudding user response：

There is no need to iterate every character in every word.
This is an alternative solution, in case you don't want to use itertools (although that looked pretty tidy).

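def word_stats(data: str=""):
    stats = []
    for word in data.split(" "):
        res = []
        while len(word)>0:
            # count the first remaining letter, then remove all of its occurrences
            res.append(word[:1] + str(word.count(word[:1])))
            word = word.replace(word[:1],"")
        res.sort()  # orders each word's letters alphabetically
        stats.append("".join(res))
    return " ".join(stats)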

print(word_stats("asjssjbjbbhsiaiic ifiaficjxzjooro qoprlllkskrmsnm mmvvllvlxjxj jfnnfcncnnccnncsllsdfi"))
print(word_stats("abbb cc a"))
print(word_stats("bbbaaacddddee"))

This would output:

a2b3c1h1i3j3s4 a1c1f2i3j2o3r1x1z1 k2l3m2n1o1p1q1r2s2 j2l3m2v3x2 c5d1f3i1j1l2n7s2
a1b3 c2 a1
a3b3c1d4e2
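
The letters within each word come out alphabetically because of res.sort(), so "bbbaaacddddee" gives a3b3c1d4e2 rather than the b3a3c1d4e2 asked for in the question. Below is a minimal sketch of the same count-and-replace idea that keeps the order of first appearance, assuming that is the desired behaviour. The name word_stats_in_order is hypothetical, and split() without arguments is a substitution so that runs of spaces collapse into one separator.

```python
def word_stats_in_order(data: str = "") -> str:
    """Count-and-replace per word, keeping first-appearance order."""
    parts = []
    for word in data.split():  # no argument: runs of spaces collapse
        res = []
        while word:
            first = word[0]
            # Record the total for this letter, then strip all its occurrences.
            res.append(first + str(word.count(first)))
            word = word.replace(first, "")
        parts.append("".join(res))
    return " ".join(parts)


print(word_stats_in_order("bbbaaacddddee"))  # b3a3c1d4e2
print(word_stats_in_order("a   b"))          # a1 b1
```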