I want to sort the following word pool by how often each word's 3-letter suffix occurs, from most frequent to least frequent:
wordPool = ['beat','neat','food','good','mood','wood','bike','like','mike']
Expected output:
['food','good','mood','wood','bike','like','mike','beat','neat']
For simplicity, the pool contains only 4-letter words, and the suffix is always the last 3 letters.
(Note: If the counts are the same, then order can be arbitrary.)
CodePudding user response:
You can use collections.Counter() to get the frequency of the suffixes, and then use sort() with a key parameter to sort by the generated frequencies:
from collections import Counter
suffix_counters = Counter(s[-3:] for s in wordPool)
wordPool.sort(key=lambda x: suffix_counters[x[-3:]], reverse=True)
print(wordPool)
This outputs:
['food', 'good', 'mood', 'wood', 'bike', 'like', 'mike', 'beat', 'neat']
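One thing to watch: when two different suffixes have the same count, sorting by count alone leaves their words in original order, so they can end up interleaved. The question allows arbitrary order for ties, but if you want same-suffix words kept together, a possible sketch is to add the suffix itself as a secondary key. The function name sort_by_suffix_count and the parameter n are my own, not from the question:

```python
from collections import Counter

def sort_by_suffix_count(words, n=3):
    """Return a new list sorted by descending suffix frequency.

    Hypothetical helper: ties between suffixes with equal counts are
    broken alphabetically by suffix, so same-suffix words stay adjacent.
    """
    counts = Counter(w[-n:] for w in words)
    # Negative count gives descending frequency; suffix groups the ties.
    return sorted(words, key=lambda w: (-counts[w[-n:]], w[-n:]))

words = ['beat', 'bike', 'neat', 'like', 'food']
print(sort_by_suffix_count(words))
# ['beat', 'neat', 'bike', 'like', 'food']
```

Unlike wordPool.sort(), this uses sorted() and leaves the input list unchanged.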
CodePudding user response:
- Group by suffix using a dict of lists;
- Sort the groups by decreasing order of size;
- Join all the groups into a list.
def sorted_by_suffix_frequency(wordpool, n=3):
    groups = {}
    for w in wordpool:
        groups.setdefault(w[-n:], []).append(w)
    return [w for g in sorted(groups.values(), key=len, reverse=True) for w in g]
wordpool = ['beat','neat','food','good','mood','wood','bike','like','mike']
sorted_wordpool = sorted_by_suffix_frequency(wordpool)
print(sorted_wordpool)
# ['food', 'good', 'mood', 'wood', 'bike', 'like', 'mike', 'beat', 'neat']
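Because the suffix is taken with w[-n:], the same grouping idea should also work for words of other lengths and for a different suffix length. Here is a minimal sketch of that, assuming a 2-letter suffix and a made-up pool of 3-letter words; it uses collections.defaultdict instead of setdefault, and the name group_sort_by_suffix is only for illustration:

```python
from collections import defaultdict

def group_sort_by_suffix(words, n=3):
    """Group words by their last n letters, largest group first."""
    groups = defaultdict(list)
    for w in words:
        groups[w[-n:]].append(w)
    # sorted() is stable, so equal-sized groups keep first-seen order.
    ordered = sorted(groups.values(), key=len, reverse=True)
    return [w for group in ordered for w in group]

# Hypothetical pool, not from the question
pool = ['cat', 'hat', 'dog', 'bat', 'log', 'sun']
print(group_sort_by_suffix(pool, n=2))
# ['cat', 'hat', 'bat', 'dog', 'log', 'sun']
```

Words with the same suffix always stay together here, since each group is emitted as a whole.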