Sorting list of strings based on multiple regex groups in python

Time:03-05

I have a list of strings of the form:

my_list=['i99_W_t10', 'i99_M_t11', 'i94_M_t12', 'i93_W_t2', ..., 'i14_M_t19']

(so always 3 fields separated by _). I would like to sort this list first by the 2nd field, then by the 3rd, and then by the 1st. So the list above would become:

my_list=['i99_M_t11',  'i94_M_t12', 'i14_M_t19', 'i93_W_t2', 'i99_W_t10', ... ]

I know how to do this by turning the list into a pandas dataframe, splitting the fields into columns, sorting them, and returning the sorted list. But perhaps there is a more elegant way to do this directly on the list, without going through dataframes?

CodePudding user response:

Split each string on '_' and then use operator.itemgetter to pull the fields out of each resulting list in the order you want to sort by:

from operator import itemgetter
my_list=['i99_W_t10', 'i99_M_t11', 'i94_M_t12', 'i93_W_t2','i14_M_t19']
key_func = lambda x: itemgetter(1, 2, 0)(x.split('_'))
sorted(my_list, key=key_func)
# ['i99_M_t11', 'i94_M_t12', 'i14_M_t19', 'i99_W_t10', 'i93_W_t2']

Example of itemgetter in action

itemgetter(1, 2, 0)(['a', 'b', 'c'])
# ('b', 'c', 'a')

itemgetter(2, 1, 0)(['a', 'b', 'c'])
# ('c', 'b', 'a')
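
Note that the key above compares every field as a string, so 't10' sorts before 't2'. The expected output in the question places 'i93_W_t2' before 'i99_W_t10', which suggests the numeric parts should be compared as numbers. A minimal sketch of that variant using regex groups follows; the pattern and the name natural_key are assumptions for illustration, and the pattern assumes each outer field is letters followed by digits, as in the sample data.

```python
import re

# Assumed format: <letters><digits>_<letters>_<letters><digits>, e.g. 'i99_W_t10'
PATTERN = re.compile(r'([A-Za-z]+)(\d+)_([A-Za-z]+)_([A-Za-z]+)(\d+)')

def natural_key(s):
    # Hypothetical helper: build a sort key from the regex groups
    m = PATTERN.fullmatch(s)
    if m is None:
        raise ValueError(f"unexpected format: {s!r}")
    prefix1, num1, middle, prefix3, num3 = m.groups()
    # Order: 2nd field, then 3rd field (numerically), then 1st field (numerically)
    return (middle, prefix3, int(num3), prefix1, int(num1))

my_list = ['i99_W_t10', 'i99_M_t11', 'i94_M_t12', 'i93_W_t2', 'i14_M_t19']
result = sorted(my_list, key=natural_key)
print(result)
# ['i99_M_t11', 'i94_M_t12', 'i14_M_t19', 'i93_W_t2', 'i99_W_t10']
```

Raising on a non-matching string is one possible choice; silently skipping or falling back to plain string comparison would also work, depending on how clean the input is.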

CodePudding user response:

You can do it like this...

my_list = ['i99_W_t10', 'i99_M_t11', 'i94_M_t12', 'i93_W_t2', 'i14_M_t19']
# print(my_list)

def getValuesFromListItem(n):
    # Collect field n of every string currently in my_list
    global my_list
    mvl = []
    for i in my_list:
        mvl += [i.split("_")[n]]
    return mvl

def getListItemFromValue(val, n):
    # Remove and return the first string in my_list whose field n equals val
    res = ""
    global my_list
    k = 0
    for i in my_list:
        l = i.split("_")
        if l[n] == val:
            res = i
            del my_list[k]
            break
        k += 1
    return res

l1 = getValuesFromListItem(1)
l2 = sorted(l1)
my_list2 = []
for i in l2:
    my_list2.append(getListItemFromValue(i, 1))

my_list = my_list2
my_list2 = []

l1 = getValuesFromListItem(2)
l2 = sorted(l1)
for i in l2:
    my_list2.append(getListItemFromValue(i, 2))

my_list = my_list2
my_list2 = []

l1 = getValuesFromListItem(0)
l2 = sorted(l1)
for i in l2:
    my_list2.append(getListItemFromValue(i, 0))

print(my_list2)

Output (the first list appears only if the commented-out print(my_list) is enabled):

['i99_W_t10', 'i99_M_t11', 'i94_M_t12', 'i93_W_t2', 'i14_M_t19']

['i14_M_t19', 'i93_W_t2', 'i94_M_t12', 'i99_W_t10', 'i99_M_t11']
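
The three passes in this answer run in the order 2nd, 3rd, 1st field, so the last pass (the 1st field) decides the final order, which is what the printed result shows. If the intended priority is 2nd, then 3rd, then 1st, the same multi-pass idea can be sketched with Python's stable sorted(), applying the keys from least to most significant. The helper name field is illustrative only, and the fields are still compared as plain strings here.

```python
my_list = ['i99_W_t10', 'i99_M_t11', 'i94_M_t12', 'i93_W_t2', 'i14_M_t19']

def field(n):
    # Hypothetical helper: return a key function that extracts field n
    return lambda s: s.split('_')[n]

multi_pass = my_list
# sorted() is stable, so sort by the least significant key first (1st field),
# then the 3rd field, and finally the most significant key (2nd field)
for n in (0, 2, 1):
    multi_pass = sorted(multi_pass, key=field(n))

print(multi_pass)
# ['i99_M_t11', 'i94_M_t12', 'i14_M_t19', 'i99_W_t10', 'i93_W_t2']
```

This gives the same result as a single sort with the tuple key (2nd, 3rd, 1st), at the cost of three passes over the list.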
