Sort a list in Python, ignore blanks and case

Time:01-15

I have a list (of dictionary keys), which I need to sort. This is my list:

listToBeSorted = ["Right  Coronary Artery 2", "Right Coronary Artery 1", "RIght Coronary Artery 3"]

Obviously, the order in which I'd like to have these items sorted would be:

["Right Coronary Artery 1", "Right  Coronary Artery 2", "RIght Coronary Artery 3"]

So I need to find a way to sort, ignoring the double blanks in the first item, and the uppercase "I" in the last item.

I tried the following sorting mechanisms:

  1. Plain sorting

    sortedList = sorted(listToBeSorted)
    

    will produce:

    ['RIght Coronary Artery 3',
     'Right  Coronary Artery 2',
     'Right Coronary Artery 1']
    
  2. Sorting, ignoring case:

    sortedList = sorted(listToBeSorted, key=str.casefold)
    

    will produce:

    ['Right  Coronary Artery 2',
     'Right Coronary Artery 1',
     'RIght Coronary Artery 3']
    
  3. Sorting, eliminating all blanks

    sortedList = sorted(listToBeSorted, key=lambda x: ''.join(x.split()))
    

    will produce:

    ['RIght Coronary Artery 3',
     'Right Coronary Artery 1',
     'Right  Coronary Artery 2']
    

I cannot change the entries themselves, as I need them to access the items in a dictionary later.

I eventually converted each list entry into a tuple, adding an uppercase version without blanks as a second element, and sorted the list by that second element:

sortedListWithTwin = []
    
# Add an uppercase "twin" without whitespaces
for item in listToBeSorted:
  sortString = (item.upper()).replace(" ","")
  sortedListWithTwin.append((item, sortString))
       
# Sort list by the new "twin"
sortedListWithTwin.sort(key = lambda x: x[1])
    
# Remove the twin
sortedList = []
for item in sortedListWithTwin:
  sortedList.append(item[0])

This will produce the desired order:

['Right Coronary Artery 1',
 'Right  Coronary Artery 2',
 'RIght Coronary Artery 3']

However, this solution seems very cumbersome and inefficient. What would be a better way to solve this?

CodePudding user response:

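Sort with a key function that casefolds each string and removes the blanks. The key is only used for comparison, so the list entries themselves are not changed: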

sortedList = sorted(listToBeSorted, key=lambda x: x.casefold().replace(" ", ""))
print(sortedList)

If you don't want to use replace() for some reason, you could use a regex instead.
re.sub() replaces every match of the pattern with an empty string. \s+ matches one or more consecutive whitespace characters (tabs and newlines included, not only spaces). casefold() is kept to ignore case.

import re

sortedList = sorted(listToBeSorted, key=lambda x: re.sub(r"\s+", "", x.casefold()))
print(sortedList)

Output:

['Right Coronary Artery 1',
 'Right  Coronary Artery 2',
 'RIght Coronary Artery 3']
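
Since the question mentions that the entries are needed later as dictionary keys, here is a minimal sketch of how the same idea might look with a named key function. The helper name sort_key and the dictionary measurements (including its placeholder values) are assumptions made up for illustration; only the three key strings come from the question.

```python
def sort_key(text):
    # Hypothetical helper: casefold() makes the comparison case-insensitive,
    # split() without arguments splits on any whitespace run, and join()
    # glues the pieces back together with no whitespace at all.
    return "".join(text.casefold().split())


# Hypothetical dictionary; the values are placeholders, not real data.
measurements = {
    "Right  Coronary Artery 2": "second",
    "Right Coronary Artery 1": "first",
    "RIght Coronary Artery 3": "third",
}

# Iterating a dict yields its keys, so this sorts the keys by the derived key.
sorted_keys = sorted(measurements, key=sort_key)

# The original strings are returned unchanged, so lookups still work.
for name in sorted_keys:
    print(repr(name), measurements[name])
```

The key function only decides the order; sorted() returns the original strings, double blank and uppercase "I" included.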

CodePudding user response:

sortedList = sorted(listToBeSorted, key=lambda x: x.upper().replace(" ", ""))
print(sortedList)
#['Right Coronary Artery 1', 'Right  Coronary Artery 2', 'RIght Coronary Artery 3']
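
On the efficiency concern: the "twin" approach from the question is what sorted() already does internally when given a key. The following sketch, with a hypothetical counting_key wrapper added purely for demonstration, compares the two and records how often the key function runs.

```python
listToBeSorted = [
    "Right  Coronary Artery 2",
    "Right Coronary Artery 1",
    "RIght Coronary Artery 3",
]

calls = []


def counting_key(text):
    # Hypothetical wrapper: remembers every call so we can count them.
    calls.append(text)
    return text.upper().replace(" ", "")


# Variant 1: let sorted() compute and use the key.
with_key = sorted(listToBeSorted, key=counting_key)

# Variant 2: the manual "twin" approach from the question, condensed.
decorated = [(item.upper().replace(" ", ""), item) for item in listToBeSorted]
decorated.sort(key=lambda pair: pair[0])
manual = [item for _, item in decorated]

print(with_key == manual)  # True
print(len(calls))          # 3
```

The key function is called once per list element, not once per comparison, so the one-line version should not be slower than building the tuples by hand.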