Background
I want to rename thousands of files whose names all end with a site number (1 through 15,000) and a category number (1 through 8). To do that I want to loop through the files and rename them. The problem is that I can't figure out how to sort the file paths in the list I have.
First I use this to get the list of file paths in my CWD:
import os
import shutil
from pathlib import Path
import glob
lst = os.listdir(os.getcwd())
The list comes back in what looks like a random order; it usually starts at 10000_1.txt. Here is a short version of the list that works as an example:
lst = ['10000_1.txt','10000_2.txt','10000_3.txt','10000_4.txt','10000_5.txt','10000_6.txt','10000_7.txt','10000_8.txt',
'1000_1.txt','1000_2.txt','1000_3.txt','1000_4.txt','1000_5.txt','1000_6.txt','1000_7.txt','1000_8.txt',
'16494_1.txt','16494_2.txt','16494_3.txt','16494_4.txt','16494_5.txt','16494_6.txt','16494_7.txt','16494_8.txt',
'100_1.txt','100_2.txt','100_3.txt','100_4.txt','100_5.txt','100_6.txt','100_7.txt','100_8.txt',
'1_1.txt','1_2.txt','1_3.txt','1_4.txt','1_5.txt','1_6.txt','1_7.txt','1_8.txt']
In short we have 5 sites here with 8 category numbers: 1_(1 through 8), 100_(1 through 8), 1000_(1 through 8), 10000_(1 through 8), and 16494_(1 through 8). They are all .txt.
What I tried
lst = lst.sort()
print(lst)
I don't know what to do. I have tried other things, but I don't get anything or it doesn't sort anything. I want it to look like this:
What I want
lst = ['1_1.txt','1_2.txt','1_3.txt','1_4.txt','1_5.txt','1_6.txt','1_7.txt','1_8.txt',
'100_1.txt','100_2.txt','100_3.txt','100_4.txt','100_5.txt','100_6.txt','100_7.txt','100_8.txt',
'1000_1.txt','1000_2.txt','1000_3.txt','1000_4.txt','1000_5.txt','1000_6.txt','1000_7.txt','1000_8.txt',
'10000_1.txt','10000_2.txt','10000_3.txt','10000_4.txt','10000_5.txt','10000_6.txt','10000_7.txt','10000_8.txt',
'16494_1.txt','16494_2.txt','16494_3.txt','16494_4.txt','16494_5.txt','16494_6.txt','16494_7.txt','16494_8.txt']
Any help would be appreciated!
CodePudding user response:
You need to use a custom key for the sorting:
>>> sorted(lst, key=lambda x: (int(x.split("_")[0]), int(x.split("_")[1].split(".")[0])))
Or:
>>> sorted(lst, key=lambda x: tuple(map(int, x.split(".")[0].split("_"))))
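The same idea can be written with a named key function, which may be easier to read and reuse. This is only a minimal sketch: the name site_and_category and the short sample list are made up for illustration, and it assumes every file name has the form <site>_<category>.txt. It also shows that sorted() returns a new list while list.sort() sorts in place and returns None, which is why lst = lst.sort() leaves lst set to None.

```python
import os

def site_and_category(filename):
    """Return (site, category) as integers for names like '100_3.txt'."""
    stem, _ext = os.path.splitext(filename)  # '100_3.txt' -> '100_3'
    site, category = stem.split("_")
    return int(site), int(category)

# Hypothetical sample, deliberately out of order.
names = ["10000_2.txt", "100_8.txt", "2_1.txt", "100_1.txt", "1_3.txt"]

# sorted() builds a new list and leaves `names` untouched.
ordered = sorted(names, key=site_and_category)
print(ordered)

# list.sort() sorts in place and returns None.
result = names.sort(key=site_and_category)
print(result)
```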
CodePudding user response:
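You can use the split string as the key, but the pieces are then compared as strings, so this only works when string order happens to match numeric order, as it does for the sample list (a site such as 2 would sort after 16494):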
sorted(lst, key=lambda x: x.split('_'))
Output:
['1_1.txt', '1_2.txt', '1_3.txt', '1_4.txt', '1_5.txt', '1_6.txt', '1_7.txt', '1_8.txt', '100_1.txt', '100_2.txt', '100_3.txt', '100_4.txt', '100_5.txt', '100_6.txt', '100_7.txt', '100_8.txt', '1000_1.txt', '1000_2.txt', '1000_3.txt', '1000_4.txt', '1000_5.txt', '1000_6.txt', '1000_7.txt', '1000_8.txt', '10000_1.txt', '10000_2.txt', '10000_3.txt', '10000_4.txt', '10000_5.txt', '10000_6.txt', '10000_7.txt', '10000_8.txt', '16494_1.txt', '16494_2.txt', '16494_3.txt', '16494_4.txt', '16494_5.txt', '16494_6.txt', '16494_7.txt', '16494_8.txt']
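One thing to watch with a string key: the pieces are compared character by character, so the result matches numeric order only for some inputs. A small sketch with a made-up list that includes site 2 shows the difference between comparing as text and comparing as integers:

```python
# Hypothetical sample that includes a site number starting with '2'.
names = ["100_1.txt", "2_1.txt", "1_1.txt", "16494_1.txt"]

# String pieces compare character by character, so '16494' sorts before '2'.
as_text = sorted(names, key=lambda x: x.split("_"))
print(as_text)

# Converting the site number to int restores numeric order.
as_numbers = sorted(names, key=lambda x: int(x.split("_")[0]))
print(as_numbers)
```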
Another alternative is natsorted from the third-party natsort package:
from natsort import natsorted
natsorted(lst)
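If installing a package is not an option, a similar "natural" ordering can be approximated with the standard library. This is a rough sketch rather than a replacement for natsort: the helper name natural_key is made up, and it assumes plain ASCII file names.

```python
import re

def natural_key(text):
    """Split text into digit and non-digit runs, turning digit runs into ints."""
    # re.split with a capturing group keeps the digit runs in the result,
    # e.g. '2_8.txt' -> ['', '2', '_', '8', '.txt'].
    return [int(part) if part.isdigit() else part
            for part in re.split(r"(\d+)", text)]

# Hypothetical sample, deliberately out of order.
names = ["10000_1.txt", "2_8.txt", "100_2.txt", "2_1.txt", "1_5.txt"]
natural = sorted(names, key=natural_key)
print(natural)
```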
CodePudding user response:
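def sort_rule(filename):
    # Compare site numbers as integers, not as strings.
    return int(filename.split('_')[0])

# Sort in place; list.sort() returns None, so do not reassign the result.
lst.sort(key=sort_rule)
I think all you need to do is look at the site number. Python's sort is stable, so files with the same site number keep their existing relative order, which is already by category in the sample list.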
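A key that looks only at the site number relies on the incoming order of the categories, because a stable sort leaves ties exactly as it found them. The sketch below uses a made-up, deliberately shuffled list to show this, and compares it with a key that includes the category; the helper name both is hypothetical.

```python
# Hypothetical sample where categories are NOT already in order.
names = ["100_2.txt", "1_8.txt", "100_1.txt", "1_3.txt"]

# Site-only key: ties keep their incoming order because the sort is stable.
by_site = sorted(names, key=lambda name: int(name.split("_")[0]))
print(by_site)

def both(name):
    """Return (site, category) as integers so categories are ordered too."""
    site, category = name.split(".")[0].split("_")
    return int(site), int(category)

by_site_and_category = sorted(names, key=both)
print(by_site_and_category)
```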