I have a list:
lst = ['45MO_221115_High4_d-1.tif', '45MO_221115_High4_d-2.tif', '45MO_221115_High4_d0.tif', '45MO_221115_High4_d1.tif', '45MO_221115_High4_d3.tif',
'45MO_221115_Low1_d-1.tif', '45MO_221115_Low1_d-2.tif', '45MO_221115_Low1_d0.tif', '45MO_221115_Low1_d1.tif', '45MO_221115_Low1_d3.tif',
'45MO_221115_Med2_d-1.tif', '45MO_221115_Med2_d-2.tif', '45MO_221115_Med2_d0.tif', '45MO_221115_Med2_d1.tif', '45MO_221115_Med2_d3.tif']
How can I sort the list so that the day -2 images (d-2) appear before the day -1 images (d-1)? The d-2 entries can appear at the beginning of each set, like this:
lst_s = ['45MO_221115_High4_d-2.tif', '45MO_221115_High4_d-1.tif', '45MO_221115_High4_d0.tif', '45MO_221115_High4_d1.tif', '45MO_221115_High4_d3.tif',
'45MO_221115_Low1_d-2.tif', '45MO_221115_Low1_d-1.tif', '45MO_221115_Low1_d0.tif', '45MO_221115_Low1_d1.tif', '45MO_221115_Low1_d3.tif',
'45MO_221115_Med2_d-2.tif', '45MO_221115_Med2_d-1.tif', '45MO_221115_Med2_d0.tif', '45MO_221115_Med2_d1.tif', '45MO_221115_Med2_d3.tif']
or all the d-2 can be grouped at the beginning:
lst_s = ['45MO_221115_High4_d-2.tif', '45MO_221115_Low1_d-2.tif', '45MO_221115_Med2_d-2.tif',
'45MO_221115_High4_d-1.tif', '45MO_221115_Low1_d-1.tif', ...]
Both variants are fine. The second one is probably easier.
CodePudding user response:
One approach to match the second output (first d-2, then d-1, then the rest) is to do:
res = sorted(lst, key=lambda s: ('d-2' not in s, 'd-1' not in s))
print(res)
Output
['45MO_221115_High4_d-2.tif',
'45MO_221115_Low1_d-2.tif',
'45MO_221115_Med2_d-2.tif',
'45MO_221115_High4_d-1.tif',
'45MO_221115_Low1_d-1.tif',
'45MO_221115_Med2_d-1.tif',
'45MO_221115_High4_d0.tif',
'45MO_221115_High4_d1.tif',
'45MO_221115_High4_d3.tif',
'45MO_221115_Low1_d0.tif',
'45MO_221115_Low1_d1.tif',
'45MO_221115_Low1_d3.tif',
'45MO_221115_Med2_d0.tif',
'45MO_221115_Med2_d1.tif',
'45MO_221115_Med2_d3.tif']
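To see why this key works, it may help to look at the tuples it produces. The sketch below is only an illustration: the helper name `day_key` and the three-element `sample` list are not part of the answer above. Since `False` sorts before `True`, names containing `d-2` come first, then those containing `d-1`, and the rest keep their original relative order because Python's sort is stable.

```python
# Illustrative sketch: inspect the tuples produced by the lambda key.
# `day_key` is a hypothetical name used only for this example.
def day_key(s):
    return ('d-2' not in s, 'd-1' not in s)

sample = ['45MO_221115_High4_d-1.tif',
          '45MO_221115_High4_d-2.tif',
          '45MO_221115_High4_d0.tif']

# Print each name next to the key used to sort it.
for name in sample:
    print(name, day_key(name))

ordered = sorted(sample, key=day_key)
print(ordered)
```

Note that this relies on substring checks, so a name such as `..._d-10.tif` would also count as containing `'d-1'`.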
CodePudding user response:
Use natsort directly:
from natsort import realsorted
lst_s = realsorted(lst)
print(lst_s)
This gives:
['45MO_221115_High4_d-2.tif',
'45MO_221115_High4_d-1.tif',
'45MO_221115_High4_d0.tif',
'45MO_221115_High4_d1.tif',
'45MO_221115_High4_d3.tif',
'45MO_221115_Low1_d-2.tif',
'45MO_221115_Low1_d-1.tif',
'45MO_221115_Low1_d0.tif',
'45MO_221115_Low1_d1.tif',
'45MO_221115_Low1_d3.tif',
'45MO_221115_Med2_d-2.tif',
'45MO_221115_Med2_d-1.tif',
'45MO_221115_Med2_d0.tif',
'45MO_221115_Med2_d1.tif',
'45MO_221115_Med2_d3.tif']
CodePudding user response:
The sorted function takes a key argument (see the documentation). The value you pass to this argument can be a function that takes one argument, the element of the list being sorted, and returns a "key" to use for sorting in place of the item itself.
We can define this function to take the file name and return the relevant digits as an integer, using a regular expression to capture the relevant portion of the filename. The pattern matches the letter d, then captures an optional minus sign and one or more digits, followed by .tif:
import re
def file_to_key(filename):
    digits = re.findall(r"d(-?\d+)\.tif", filename)[0]
    return int(digits)
files = ['45MO_221115_High4_d-1.tif', '45MO_221115_High4_d-2.tif', '45MO_221115_High4_d0.tif', '45MO_221115_High4_d1.tif', '45MO_221115_High4_d3.tif', '45MO_221115_Low1_d-1.tif', '45MO_221115_Low1_d-2.tif', '45MO_221115_Low1_d0.tif', '45MO_221115_Low1_d1.tif', '45MO_221115_Low1_d3.tif', '45MO_221115_Med2_d-1.tif', '45MO_221115_Med2_d-2.tif', '45MO_221115_Med2_d0.tif', '45MO_221115_Med2_d1.tif', '45MO_221115_Med2_d3.tif']
files_s = sorted(files, key=file_to_key)
print(files_s)
which gives the desired order:
['45MO_221115_High4_d-2.tif', '45MO_221115_Low1_d-2.tif', '45MO_221115_Med2_d-2.tif',
'45MO_221115_High4_d-1.tif', '45MO_221115_Low1_d-1.tif', '45MO_221115_Med2_d-1.tif',
'45MO_221115_High4_d0.tif', '45MO_221115_Low1_d0.tif', '45MO_221115_Med2_d0.tif',
'45MO_221115_High4_d1.tif', '45MO_221115_Low1_d1.tif', '45MO_221115_Med2_d1.tif',
'45MO_221115_High4_d3.tif', '45MO_221115_Low1_d3.tif', '45MO_221115_Med2_d3.tif']
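As a quick check of what the regular expression captures, the following sketch prints the intermediate values for a few names. The variable names (`pattern`, `names`, `days`) are chosen for this example only.

```python
import re

# Captures an optional minus sign followed by one or more digits,
# located between the letter "d" and the ".tif" extension.
pattern = r"d(-?\d+)\.tif"

names = ['45MO_221115_High4_d-2.tif',
         '45MO_221115_Low1_d0.tif',
         '45MO_221115_Med2_d3.tif']

days = []
for name in names:
    captured = re.findall(pattern, name)  # list with one captured string
    days.append(int(captured[0]))         # convert so that -2 < -1 < 0 numerically
    print(name, captured, days[-1])
```

Converting to `int` matters here: compared as strings, `'-1'` would sort before `'-2'`, which is the ordering problem from the question.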
To get the order shown in your first output option, we need to sort first by the part of the filename before the integer, and then by the integer itself. Doing that is easy: have the key function return those values, in that order, as a tuple:
def file_to_key(filename):
    search = re.findall(r"(.*d)(-?\d+)\.tif", filename)
    file_str, digits = search[0]
    return (file_str, int(digits))
files_s = sorted(files, key=file_to_key)
print(files_s)
Which gives:
['45MO_221115_High4_d-2.tif', '45MO_221115_High4_d-1.tif', '45MO_221115_High4_d0.tif', '45MO_221115_High4_d1.tif', '45MO_221115_High4_d3.tif',
'45MO_221115_Low1_d-2.tif', '45MO_221115_Low1_d-1.tif', '45MO_221115_Low1_d0.tif', '45MO_221115_Low1_d1.tif', '45MO_221115_Low1_d3.tif',
'45MO_221115_Med2_d-2.tif', '45MO_221115_Med2_d-1.tif', '45MO_221115_Med2_d0.tif', '45MO_221115_Med2_d1.tif', '45MO_221115_Med2_d3.tif']
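If the list might contain names that do not follow the `..._d<number>.tif` pattern, indexing `[0]` on an empty `re.findall` result would raise an `IndexError`. A minimal sketch of a more tolerant key is shown below. The function name `safe_file_to_key`, the choice to place non-matching names at the end, and the example file `notes.txt` are all assumptions for illustration, not part of the question.

```python
import re

# Compiled once; named groups make the two parts of the key explicit.
PATTERN = re.compile(r"(?P<prefix>.*d)(?P<day>-?\d+)\.tif$")

def safe_file_to_key(filename):
    """Hypothetical variant of file_to_key that tolerates non-matching names."""
    match = PATTERN.search(filename)
    if match is None:
        # Assumption: names that do not match sort after all matching names.
        return (1, filename, 0)
    return (0, match.group("prefix"), int(match.group("day")))

names = ['45MO_221115_Low1_d1.tif',
         'notes.txt',
         '45MO_221115_Low1_d-2.tif',
         '45MO_221115_High4_d0.tif']

result = sorted(names, key=safe_file_to_key)
print(result)
```

The leading `0` or `1` in the tuple keeps every key comparable, since all keys share the same shape (int, str, int).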