Suppose, I want to sort characters in a string by their frequency.
def frequencySort(self, s: str) -> str:
freq = {}
for i in s:
if i in freq:
freq[i] = 1
else:
freq[i] = 1
print(freq)
sorted_chars = sorted(s, key = lambda x : -freq[x])
return "".join(sorted_chars)
However, for inputs where some characters have equal frequency, these characters end up intermingled together in the output.
E.g., for input loveleetcode
the code above outputs eeeelolovtcd
whereas it should output eeeeoollvtcd
or eeeelloovtcd
(o's and l's should be clustered together, not intermixed).
So, I add a tie breaker that uses a fraction of the character's ASCII code (end's up being less than one, so guaranteed not to overpower next most frequent char):
def frequencySort(self, s: str) -> str:
freq = {}
for i in s:
if i in freq:
freq[i] = 1
else:
freq[i] = 1
# Tie-breaker:
for i in freq:
freq[i] = ord(i) / 1000
sorted_chars = sorted(s, key = lambda x : -freq[x])
return "".join(sorted_chars)
This works fine, but I was wondering if there is a more elegant (pythonic) way to enforce clustering of similar characters during the sort.
CodePudding user response:
Change the sort key to a tuple that includes the character itself.
def frequencySort(self, s: str) -> str:
freq = {}
for i in s:
if i in freq:
freq[i] = 1
else:
freq[i] = 1
sorted_chars = sorted(s, key=lambda x: (-freq[x], x))
return "".join(sorted_chars)
Now, when sorted()
encounters two different characters with the same count, it'll use the next element of the tuple to sort them. Since the next element is the character itself, the same characters will be sorted next to each other.
>>> solution.frequencySort("loveleetcode")
'eeeelloocdtv'
Unrelated note: you can use collections.Counter
instead of creating a dict and counting the characters.
freq = collections.Counter(s)
instead of
freq = {}
for i in s:
if i in freq:
freq[i] = 1
else:
freq[i] = 1
CodePudding user response:
You can use a collections.Counter
and its most_common
method that returns tuples of (char, count) sorted by count:
from collections import Counter
def frequencySort(s: str) -> str:
counts = Counter(s)
return ''.join([char * count for char, count in counts.most_common()])
print(frequencySort("loveleetcode"))
# 'eeeelloovtcd'