How can I sort a list of strings based on characters inside them?

Time:11-14

I have a list of strings that looks like this:

['training_tech26.txt', 'training_tech41.txt', 'training_tech68.txt', 'training_tech84.txt', 'training_tech52.txt', 'training_sales17.txt', 'training_sales2.txt', 'training_tech47.txt', 'training_sales23.txt', 'training_sales3.txt', 'training_tech9.txt', 'training_tech12.txt']

I need to sort these files into the right order, like:

['training_tech1.txt', 'training_tech2.txt', 'training_sales3.txt', 'training_tech4.txt', 'training_tech5.txt']

I am using this code to access all files inside my folder and append them to one list. In the folder itself they appear in the right order, so I don't know why they are messed up in this list.

import os

tech_dir_path = "/path/to/folder/with/files"
res = []
tech_res = os.listdir(tech_dir_path)

CodePudding user response:

Looking at your sample inputs, here is my code:

files = ['training_tech26.txt', 'training_tech41.txt', 'training_tech68.txt', 'training_tech84.txt', 'training_tech52.txt', 'training_sales17.txt', 'training_sales2.txt', 'training_tech47.txt', 'training_sales23.txt', 'training_sales3.txt', 'training_tech9.txt', 'training_tech12.txt']

def sorter(names):
    # join all digits found in a name and convert them to one int
    def number(name):
        return int(''.join(ch for ch in name if ch.isdigit()))
    # unique numbers in ascending order (a set avoids appending a name twice when two names share a number)
    idx = sorted({number(name) for name in names})
    res = []
    for i in idx: # iterate over the sorted numbers
        for name in names: # iterate over the file names
            if i == number(name): # the name carries the current number
                res.append(name)
    return res
print(sorter(files)) # avoid calling the list `input`, which shadows the built-in

You can also have a look here for more methods.

CodePudding user response:

You can pass a key function to sort (or sorted) that extracts, say, the number preceding the file extension.

import re

# TODO handle the possibility of a failed match
def get_number(s):
    n = re.search(r'(\d+)\.txt', s).group(1)
    return int(n)

Then, for example,

>>> files = ['training_tech26.txt', 'training_tech41.txt', 'training_tech68.txt', 'training_tech84.txt', 'training_tech52.txt', 'training_sales17.txt', 'training_sales2.txt', 'training_tech47.txt', 'training_sales23.txt', 'training_sales3.txt', 'training_tech9.txt', 'training_tech12.txt']

>>> for f in sorted(files, key=get_number):
...  print(f)
...
training_sales2.txt
training_sales3.txt
training_tech9.txt
training_tech12.txt
training_sales17.txt
training_sales23.txt
training_tech26.txt
training_tech41.txt
training_tech47.txt
training_tech52.txt
training_tech68.txt
training_tech84.txt

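os.listdir returns the entries in arbitrary order, not necessarily the order your file manager shows. Sorting the names as plain strings does not help either: strings are compared character by character, so '12' < '9' because '1' comes before '9'.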
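
As a rough sketch of how the pieces could fit together, the snippet below lists a throwaway folder with os.listdir and sorts the result with a variant of get_number that tolerates a failed match (the TODO above). The name get_number_safe, the choice to push non-matching names to the end, and the sample file names are assumptions made for illustration, not part of the original answer.

```python
import os
import re
import tempfile

def get_number_safe(name, default=float("inf")):
    """Return the integer just before '.txt', or `default` if there is none."""
    match = re.search(r"(\d+)\.txt$", name)
    # Names without a trailing number get `default`, so they sort last.
    return int(match.group(1)) if match else default

# Hypothetical sample files, created in a temporary folder for the demo.
sample_names = ["training_tech12.txt", "training_sales3.txt",
                "training_tech9.txt", "notes.md"]

with tempfile.TemporaryDirectory() as folder:
    for name in sample_names:
        open(os.path.join(folder, name), "w").close()

    listed = os.listdir(folder)                      # order is not guaranteed
    by_name = sorted(listed)                         # plain string comparison
    by_number = sorted(listed, key=get_number_safe)  # numeric comparison

print(by_name)    # 'training_tech12.txt' lands before 'training_tech9.txt'
print(by_number)  # 3, 9, 12, then the name without a number
```

Returning infinity for a failed match is only one option; raising an error or filtering such names out beforehand may suit a folder that should contain nothing but numbered files.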

CodePudding user response:

There is a third-party Python library for natural string sorting called natsort:

from natsort import natsorted
tech_res = ['training_tech26.txt', 'training_tech41.txt', 'training_tech68.txt', 'training_tech84.txt', 'training_tech52.txt', 'training_sales17.txt', 'training_sales2.txt', 'training_tech47.txt', 'training_sales23.txt', 'training_sales3.txt', 'training_tech9.txt', 'training_tech12.txt']

print(natsorted(tech_res, key=lambda y: y.lower()))

Output:

['training_sales2.txt', 'training_sales3.txt', 'training_sales17.txt', 'training_sales23.txt', 'training_tech9.txt', 'training_tech12.txt', 'training_tech26.txt', 'training_tech41.txt', 'training_tech47.txt', 'training_tech52.txt', 'training_tech68.txt', 'training_tech84.txt']
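
If installing a third-party package is not an option, a similar ordering can be approximated with the standard library alone. The helper below, natural_key, is a hypothetical name and only a minimal sketch: it splits each name into text and digit chunks and compares the digit chunks as integers. It is not how natsort is implemented and it ignores cases natsort handles, such as signs, decimals, and locale.

```python
import re

def natural_key(name):
    """Turn 'training_tech12.txt' into ['training_tech', 12, '.txt']."""
    # re.split with a capturing group alternates text, digits, text, ...
    # so odd positions always hold the digit runs.
    parts = re.split(r"(\d+)", name.lower())
    return [int(part) if i % 2 else part for i, part in enumerate(parts)]

tech_res = ['training_tech26.txt', 'training_tech41.txt', 'training_tech68.txt',
            'training_tech84.txt', 'training_tech52.txt', 'training_sales17.txt',
            'training_sales2.txt', 'training_tech47.txt', 'training_sales23.txt',
            'training_sales3.txt', 'training_tech9.txt', 'training_tech12.txt']

result = sorted(tech_res, key=natural_key)
print(result)
```

For this sample list the result groups the sales files before the tech files and orders each group numerically, the same order as the natsort output above.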