I have a list containing some links: ["http://link1.rar", "http://link1.rev", "http://link2.rar", "http://link2.rev"]. Is there a way to sort them so that the result looks like this: ["http://link1.rar", "http://link2.rar", "http://link1.rev", "http://link2.rev"]?
I've tried with this:
def order(x):
    if "rar" not in x:
        return x
    else:
        return ""

new_links = sorted(links, key=order)
But this way, the rev links end up sorted starting from the highest.
Answer:
You want to sort according to multiple criteria: first, the file extension; then, the whole string.
The usual trick to sort according to multiple criteria is to use a tuple as the key. Tuples are sorted in lexicographical order, which means the first criterion is compared first; and in case of a tie, the second criterion is compared.
For instance, the following key returns a tuple such as (False, 'http://link1.rar') or (True, 'http://link1.rev'):
new_links = sorted(links, key=lambda x: ('rar' not in x, x))
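To see what this key does, here is a small sketch using the list from the question. The function name ext_key is only an illustrative name; it is the same key as the lambda above, written out so the intermediate tuples can be inspected.

```python
links = ["http://link1.rar", "http://link1.rev",
         "http://link2.rar", "http://link2.rev"]

def ext_key(link):
    # ext_key is a hypothetical helper name, equivalent to the lambda above.
    # False sorts before True, so links containing 'rar' come first;
    # ties on the first element are broken by comparing the full strings.
    return ('rar' not in link, link)

keys = [ext_key(link) for link in links]
new_links = sorted(links, key=ext_key)

for key in keys:
    print(key)
print(new_links)
```

Note that 'rar' not in x tests for the substring anywhere in the link, not only in the extension, so this sketch assumes 'rar' does not appear elsewhere in the URLs.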
Alternatively, you could use str.rsplit to split the string on the last '.' and get a list such as ['http://link1', 'rev']. Lists are compared lexicographically just like tuples. Since you want the extension to be the first criterion, use slicing [::-1] to reverse the order of the list:
new_links = sorted(links, key=lambda x: x.rsplit('.', maxsplit=1)[::-1])
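As a rough illustration of the rsplit-based key, the sketch below prints the key computed for each link from the question before sorting. The name reversed_split_key is made up for this example.

```python
links = ["http://link1.rar", "http://link1.rev",
         "http://link2.rar", "http://link2.rev"]

def reversed_split_key(link):
    # Hypothetical helper name, equivalent to the lambda above.
    # rsplit gives e.g. ['http://link1', 'rev']; [::-1] reverses it to
    # ['rev', 'http://link1'] so the extension is compared first.
    return link.rsplit('.', maxsplit=1)[::-1]

split_keys = [reversed_split_key(link) for link in links]
new_links = sorted(links, key=reversed_split_key)

for key in split_keys:
    print(key)
print(new_links)
```

Unlike the 'rar' not in x test, this key does not hard-code any extension: extensions are simply ordered alphabetically, and 'rar' happens to come before 'rev'.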
Note that using rsplit('.', maxsplit=1) to split on the last '.' is a bit of a hack. If the filename contains a double extension, such as '.tar.gz', only the last extension will be isolated.
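The double-extension caveat can be seen in a short sketch. The URL below is a made-up example, and full_extension is a hypothetical helper that treats everything after the first '.' of the last path segment as the extension. That is only one possible convention, and it would misbehave on file names that contain dots for other reasons.

```python
url = "http://example.com/archive.tar.gz"  # hypothetical URL for illustration

# Splitting on the last '.' isolates only 'gz', not 'tar.gz'.
base, last_ext = url.rsplit('.', maxsplit=1)

def full_extension(link):
    # Hypothetical helper: take the part after the last '/', then
    # everything after the first '.' in that file name.
    filename = link.rsplit('/', maxsplit=1)[-1]
    _, _, ext = filename.partition('.')
    return ext

print(base, last_ext)
print(full_extension(url))
print(full_extension("http://link1.rar"))
```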
One last point to consider: numbers in strings. As long as your strings contain only single digits, such as '1' or '2', sorting the strings will work as you expect. However, if you have a 'link10' somewhere in there, the order might not be the one you expect: lexicographically, 'link10' comes before 'link2', because the first character of '10' is '1'. In that case, I refer you to this related question: Is there a built in function for string natural sort?
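For reference, one common way to get a natural order is to split each string into digit and non-digit runs and compare the digit runs as integers. The sketch below is a minimal version of that idea, not necessarily the approach recommended in the linked question; natural_key and the sample links are made up for illustration.

```python
import re

def natural_key(text):
    # Hypothetical helper: split into alternating non-digit and digit runs,
    # e.g. 'link10' -> ['link', '10', ''], and turn digit runs into ints
    # so that 10 compares greater than 2.
    parts = re.split(r'(\d+)', text)
    return [int(part) if part.isdecimal() else part for part in parts]

names = ["link10", "link2", "link1"]
plain_order = sorted(names)
natural_order = sorted(names, key=natural_key)

# Combined with the extension criterion: extension first, then natural order.
links = ["http://link10.rar", "http://link2.rev",
         "http://link2.rar", "http://link10.rev"]
new_links = sorted(
    links,
    key=lambda x: (x.rsplit('.', maxsplit=1)[-1], natural_key(x)),
)

print(plain_order)
print(natural_order)
print(new_links)
```

This sketch assumes every string follows the same text-then-number pattern; if one string has a number where another has text at the same position, comparing the keys would raise a TypeError.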