I need a sorted list of files whose names follow the format xxx_00000, xxx_00001, and so on. The issue is that once there are more than 100000 files, the names become xxx_100000 while all the earlier ones keep their five-digit form. This means that when I do os.listdir(directory) I get xxx_10000 next to xxx_100000 (i.e. xxx_10000 is at index 10,000 and xxx_100000 is at index 10,001). Any ideas on how to sort these so that they appear in the correct order? I've tried:
sorted(paths)
sorted(paths, key=lambda x: x[x.rfind('_') + 1:-4])
and
def sorted_helper(x):
    x = str(00000) + x[x.rfind('_') + 1:-4]
    return x[-7:]
sorted(paths, key=sorted_helper)
CodePudding user response:
You presumably want to sort it as an int, not as a string. Try:
sorted(paths, key=lambda filename: int(filename.split("_")[1]))
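The [:-4] slicing in the question suggests the names may carry an extension, in which case int() would raise a ValueError on something like "00000.txt". Here is a minimal sketch that handles this, assuming the number always follows the last underscore; the helper name numeric_suffix and the sample .txt names are made up for illustration:

```python
import os

def numeric_suffix(path):
    # Hypothetical helper: strip any directory and extension,
    # then read the digits after the last underscore as an int.
    stem, _ext = os.path.splitext(os.path.basename(path))
    return int(stem.rsplit("_", 1)[1])

# Sample names are assumptions; the real ones come from os.listdir(directory).
paths = ["xxx_100000.txt", "xxx_00002.txt", "xxx_10000.txt", "xxx_00001.txt"]
ordered = sorted(paths, key=numeric_suffix)
print(ordered)
```

Using rsplit("_", 1) rather than split("_") also keeps this working if the prefix itself contains underscores.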
CodePudding user response:
You can use natsort.natsorted:
from natsort import natsorted
natsorted(paths)
CodePudding user response:
The natsort library can be helpful in such cases.
For example:
from natsort import natsorted
natsorted(paths)
Output:
['xxx_00001',
'xxx_00002',
'xxx_00100',
'xxx_01000',
'xxx_10000',
'xxx_100001',
'xxx_100010',
'xxx_110000']
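If installing a third-party package is not an option, a similar ordering can be approximated with the standard library. This is only a rough sketch of the idea, not how natsort works internally; the helper name natural_key and the sample names are assumptions for illustration:

```python
import re

def natural_key(name):
    # re.split with a capturing group alternates non-digit and digit runs,
    # so the odd positions are always the digit runs; compare those as ints.
    parts = re.split(r"(\d+)", name)
    return [int(part) if i % 2 else part for i, part in enumerate(parts)]

# Sample names are assumptions; the real ones come from os.listdir(directory).
names = ["xxx_100001", "xxx_00100", "xxx_10000", "xxx_00001"]
ordered_names = sorted(names, key=natural_key)
print(ordered_names)
```

Because text and numbers always land in the same positions of the key, Python never ends up comparing a str with an int.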