I have a list of words. I figured out how to count the occurrences of each word. I now want to know how many words appear how many times in that list. The output should look something like this:
4.500 words appeared 1 time
6.000 words appeared 2 times ...
Example:
list = ["hello", "time", "burger", "hello", "mouse", "time", "time"]
Expected output:
2 words occurred 1 time
1 word occurred 2 times
1 word occurred 3 times
I hope it is clear what I mean. I can't really share any code since I have no clue how to do it.
Is there a built-in function in Counter or pandas that can do this?
Or does anyone have a smooth way of computing this?
Thanks a lot!
CodePudding user response:
Yes; use a Counter to count the number of times each word appears, and then use a Counter on that counter to count how many words appear each number of times.
>>> words = ["hello", "time", "burger", "hello", "mouse", "time", "time"]
>>> from collections import Counter
>>> word_counts = Counter(words)
>>> count_counts = Counter(word_counts.values())
>>> for times, num_words in count_counts.items():
...     print(f"{num_words} words occurred {times} times")
...
1 words occurred 2 times
1 words occurred 3 times
2 words occurred 1 times
Note that word_counts gives the counts per word, and then count_counts gives the counts per count in word_counts:
>>> word_counts
Counter({'time': 3, 'hello': 2, 'burger': 1, 'mouse': 1})
>>> count_counts
Counter({1: 2, 2: 1, 3: 1})
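If the lines should come out ordered by number of occurrences, with "word"/"time" in the singular where appropriate, the same two-Counter idea can be wrapped up roughly as follows. This is only a sketch; the helper names count_of_counts and format_line are made up for illustration and are not part of collections.

```python
from collections import Counter


def count_of_counts(words):
    """Return sorted (times, num_words) pairs.

    num_words is the number of distinct words that occur exactly `times` times.
    """
    word_counts = Counter(words)                  # word -> occurrences
    count_counts = Counter(word_counts.values())  # occurrences -> number of words
    return sorted(count_counts.items())           # ascending by occurrences


def format_line(times, num_words):
    """Build one output line, using singular forms for a count of 1."""
    word = "word" if num_words == 1 else "words"
    time = "time" if times == 1 else "times"
    return f"{num_words} {word} occurred {times} {time}"


words = ["hello", "time", "burger", "hello", "mouse", "time", "time"]
for times, num_words in count_of_counts(words):
    print(format_line(times, num_words))
# 2 words occurred 1 time
# 1 word occurred 2 times
# 1 word occurred 3 times
```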
CodePudding user response:
If you know how to count the occurrences of each word, the procedure is straightforward. In pseudo-code:
- Let A = (n + 1)*[0], where n is the number of words (n + 1 slots, since a single word can occur up to n times).
- For each distinct word, let m be the number of times the word occurs. Then do A[m] += 1.
- For i in range(1, n + 1), print(str(A[i]) + " words appear " + str(i) + " times").
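The list-based procedure above could look like this in Python. It is a minimal sketch: the function name occurrence_table is invented here, Counter is assumed as the way to get the per-word counts, and skipping empty slots when printing is a choice of this example rather than part of the pseudo-code.

```python
from collections import Counter


def occurrence_table(words):
    """Return a list A where A[i] is the number of distinct words occurring i times."""
    n = len(words)
    table = [0] * (n + 1)  # index n is reachable when every entry is the same word
    for m in Counter(words).values():  # m = occurrences of one distinct word
        table[m] += 1
    return table


words = ["hello", "time", "burger", "hello", "mouse", "time", "time"]
table = occurrence_table(words)
for i in range(1, len(table)):
    if table[i]:  # skip counts that no word has
        print(f"{table[i]} words appear {i} times")
```

Compared with a Counter of counts, this allocates one slot per possible count, so it uses more memory on long lists where most counts never occur.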