Sort list with alphabets and numbers in python

Time:10-19

Please help me sort a list. I have a list named arr:

arr = [{'name':'cator3'},
       {'name':'cator1'},
       {'name':'CATOR5 (Active A)'},
       {'name':'cator17'},
       {'name':'cator12'},
       {'name':'cator4'},
       {'name':'CATOR5 (Passive A)'},
       {'name':'cator23'},
       {'name':'cator2'}]

Each dict has a name containing both letters and numbers. I sorted the list and got the result below.

My code:

def sort_order_by(e):
    order_by = 'name'
    return e[order_by].lower()

sort='asc'

if sort == 'asc':
    arr.sort(key=sort_order_by)
elif sort == 'desc':
    arr.sort(key=sort_order_by, reverse=True)

print(arr)

And my result:

result = [{'name': 'cator1'},
          {'name': 'cator12'},
          {'name': 'cator17'},
          {'name': 'cator2'},
          {'name': 'cator23'},
          {'name': 'cator3'},
          {'name': 'cator4'},
          {'name': 'CATOR5 (Active A)'},
          {'name': 'CATOR5 (Passive A)'}]

The numbers after the initial text are in the wrong order:

cator1, cator12, cator17, cator2, cator23, cator3 ...

But 2 < 3 < 12 < 17 < 23
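
For illustration, a minimal sketch of why this happens: Python compares strings one character at a time, so the '1' in 'cator12' is compared against the '2' in 'cator2' before the length of the number is ever considered. The names are taken from the list above; the variable names is only for this example.

```python
# Strings are compared character by character, left to right.
names = ['cator2', 'cator12', 'cator3']

# 'cator12' sorts before 'cator2' because '1' < '2' at index 5.
assert 'cator12' < 'cator2'
assert sorted(names) == ['cator12', 'cator2', 'cator3']

# Compared as integers, the numeric parts come out in the expected order.
assert sorted([2, 12, 3]) == [2, 3, 12]
```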

I want the numeric parts to be compared as numbers, so the result I expect is in alphabetical and then numerical order:

expected = [{'name': 'cator1'},
            {'name': 'cator2'},
            {'name': 'cator3'},
            {'name': 'cator4'},
            {'name': 'CATOR5 (Active A)'},
            {'name': 'CATOR5 (Passive A)'},
            {'name': 'cator12'},
            {'name': 'cator17'},
            {'name': 'cator23'},]

How do I obtain the correct sort order?

CodePudding user response:

If you're OK using external libraries, I highly recommend natsort. Once you've run pip install natsort or conda install natsort or equivalent, you can do

from natsort import natsorted, ns

# arr holds dicts, so pass a key that extracts the string to compare
arr = natsorted(arr, key=lambda d: d['name'], alg=ns.IGNORECASE, reverse=sort == 'desc')

If you want in-place sorting, you can generate a sort key and use it with arr.sort:

from natsort import natsort_keygen, ns

# arr holds dicts, so the generated key must first extract the name
arr.sort(key=natsort_keygen(key=lambda d: d['name'], alg=ns.IGNORECASE), reverse=sort == 'desc')

Disclaimer: I am not the author of natsort or otherwise affiliated with it. Although I did fix a minor typo in the documentation that one time.

CodePudding user response:

Below is a brief demonstrative example that goes through the process step by step. Note that this is an ad hoc ordering specification and does not try to be too clever.

It also assumes that each name starts with a 5-character prefix followed by a number. You could use a regular expression or a similar process (or literal iteration) to identify the prefix instead. You could also generalize the approach further (though it doesn't sound like you need that).

arr=[
    {'name':'cator3'},
    {'name':'cator1'},
    {'name':'CATOR5 (Active A)'},
    {'name':'cator17'},
    {'name':'cator12'},
    {'name':'cator4'},
    {'name':'CATOR5 (Passive A)'},
    {'name':'cator23'},
    {'name':'cator2'}
]

def sort_order_by(e):
    order_by = 'name'
    key = e[order_by].lower()              ; print(key, "->", end=' ')
    split = key.split()
    rest = ' '.join(split[1:])
    key = split[0]                         ; print(key, "->", end=' ')
    key, nkey = key[:5], key[5:]           ; print(key, nkey, "->", end=' ')
    nkey = f"{int(nkey):05}"               ; print(key + nkey + rest)
    return key + nkey + rest

sort_type = 'asc'

arr.sort(key=sort_order_by, reverse=(sort_type == 'desc'))

for x in arr:
    print(x)

OUTPUT:

cator3 -> cator3 -> cator 3 -> cator00003
cator1 -> cator1 -> cator 1 -> cator00001
cator5 (active a) -> cator5 -> cator 5 -> cator00005(active a)
cator17 -> cator17 -> cator 17 -> cator00017
cator12 -> cator12 -> cator 12 -> cator00012
cator4 -> cator4 -> cator 4 -> cator00004
cator5 (passive a) -> cator5 -> cator 5 -> cator00005(passive a)
cator23 -> cator23 -> cator 23 -> cator00023
cator2 -> cator2 -> cator 2 -> cator00002

{'name': 'cator1'}
{'name': 'cator2'}
{'name': 'cator3'}
{'name': 'cator4'}
{'name': 'CATOR5 (Active A)'}
{'name': 'CATOR5 (Passive A)'}
{'name': 'cator12'}
{'name': 'cator17'}
{'name': 'cator23'} 
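
The fixed 5-character prefix assumed above can be dropped by splitting each name on its digit runs with a regular expression. The following is only a sketch of that idea, not the code used above; natural_key is a hypothetical helper name introduced here for illustration.

```python
import re

def natural_key(name):
    """Split a name into text and integer chunks, e.g. 'cator12' -> ['cator', 12, '']."""
    parts = re.split(r'(\d+)', name.lower())
    # With a capturing group, re.split alternates text, digits, text, ...
    # so the odd positions are always the digit runs.
    return [int(p) if i % 2 else p for i, p in enumerate(parts)]

arr = [
    {'name': 'cator3'},
    {'name': 'cator1'},
    {'name': 'CATOR5 (Active A)'},
    {'name': 'cator17'},
    {'name': 'cator12'},
    {'name': 'cator4'},
    {'name': 'CATOR5 (Passive A)'},
    {'name': 'cator23'},
    {'name': 'cator2'},
]

sort_type = 'asc'
arr.sort(key=lambda d: natural_key(d['name']), reverse=(sort_type == 'desc'))

for x in arr:
    print(x)
```

Because text chunks and integer chunks always sit at the same positions in every key, the lists compare without mixing str and int.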

CodePudding user response:

You could use a regular expression substitution to right-justify the numeric parts of the strings to a width of 10. This makes them sort in numerical order within the alphanumeric order of the strings.

This can be achieved using a lambda as the replacement value in re.sub():

arr = [{'name':'cator3'},
       {'name':'cator1'},
       {'name':'CATOR5 (Active A)'},
       {'name':'cator17'},
       {'name':'cator12'},
       {'name':'cator4'},
       {'name':'CATOR5 (Passive A)'},
       {'name':'cator23'},
       {'name':'cator2'}]

import re

arr.sort(key=lambda d: re.sub(r'\d+',  # \d+ matches only real digit runs, never empty strings
                              lambda n: f"{n.group():>10}",
                              d['name'].lower()))

print(*arr, sep='\n')

Output:

{'name': 'cator1'}
{'name': 'cator2'}
{'name': 'cator3'}
{'name': 'cator4'}
{'name': 'CATOR5 (Active A)'}
{'name': 'CATOR5 (Passive A)'}
{'name': 'cator12'}
{'name': 'cator17'}
{'name': 'cator23'}

If you're going to be doing this often on different dictionary lists and/or using different keys, you could make a utility function for it:

import re

def alpha_num(k):
    # Build a sort key that right-justifies every run of digits in d[k] to width 10
    return lambda d: re.sub(r'\d+', lambda n: f"{n.group():>10}", d[k].lower())

arr.sort(key=alpha_num('name'))
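
To make the padding idea concrete, here is a small sketch of what the padded keys look like and where the approach stops working. padded is a hypothetical helper written for this illustration, and its width parameter is an assumption mirroring the 10 used above.

```python
import re

def padded(name, width=10):
    # Right-justify every run of digits to `width` characters.
    return re.sub(r'\d+', lambda m: m.group().rjust(width), name.lower())

# 'cator2' becomes 'cator' + 9 spaces + '2'.
assert padded('cator2') == 'cator' + ' ' * 9 + '2'
assert padded('cator12') == 'cator' + ' ' * 8 + '12'

# A space sorts before any digit, so the shorter number comes first.
assert padded('cator2') < padded('cator12')

# Caveat: a number with more than `width` digits is left unpadded,
# so it can sort before a smaller number.
assert padded('x12345678901') < padded('x9999999999')
```

If names may contain numbers longer than the chosen width, a larger width or an integer-based key is the safer choice.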