Sort according to the numeric value at the end, before the file extension ".png"

Time:05-24

    for f in os.listdir(path):
        print(f)
    

results in:

    TSR23_kaji_v004_10.png
    TSR23_kaji_v004_100.png
    TSR23_kaji_v004_1000.png
    TSR23_kaji_v004_11.png
    TSR23_kaji_v004_12.png
    TSR23_kaji_v004_13.png
    TSR23_kaji_v004_14.png
    TSR23_kaji_v004_15.png
    TSR23_kaji_v004_16.png
    TSR23_kaji_v004_200.png
    TSR23_kaji_v004_99.png

My main problem is that the listing is not sorted by the number at the end of the name, just before the ".png" extension.

I want to have output as follows:

    TSR23_kaji_v004_10.png
    TSR23_kaji_v004_11.png
    TSR23_kaji_v004_12.png
    TSR23_kaji_v004_13.png
    TSR23_kaji_v004_14.png
    TSR23_kaji_v004_15.png
    TSR23_kaji_v004_16.png
    TSR23_kaji_v004_99.png
    TSR23_kaji_v004_100.png
    TSR23_kaji_v004_200.png
    TSR23_kaji_v004_1000.png

Please guide me to get the results

CodePudding user response:

You need a way to extract the last number and convert it from a string to an integer. A regular expression is one way to extract the text. For example, one or more digits immediately followed by .png can be found with:

import re

def get_last_int(s):
    # \d+ matches one or more digits; (?=\.png) requires ".png" right after them
    return int(re.search(r'\d+(?=\.png)', s).group())

Passing it one of your strings gives the int:

get_last_int('TSR23_kaji_v004_200.png')
# 200
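A regular expression is only one option. If every name really follows the pattern shown in the question (the number is whatever comes after the last underscore), a sketch without regex could look like this. The function name get_last_int_no_regex is made up for illustration, and the "last underscore" rule is an assumption about your file names:

```python
from pathlib import Path


def get_last_int_no_regex(name):
    # Hypothetical helper: assumes the number follows the last underscore.
    stem = Path(name).stem               # drops the ".png" suffix
    return int(stem.rsplit('_', 1)[-1])  # text after the last "_" as an int


print(get_last_int_no_regex('TSR23_kaji_v004_200.png'))
```

This raises ValueError for a name whose last part is not a number, so it only fits if the naming is consistent.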

With that, you can use sorted() to sort based on the integers in question:

import re


def get_last_int(s):
    # \d+ matches one or more digits; (?=\.png) requires ".png" right after them
    return int(re.search(r'\d+(?=\.png)', s).group())

l = ['TSR23_kaji_v004_10.png',
 'TSR23_kaji_v004_100.png',
 'TSR23_kaji_v004_1000.png',
 'TSR23_kaji_v004_11.png',
 'TSR23_kaji_v004_12.png',
 'TSR23_kaji_v004_13.png',
 'TSR23_kaji_v004_14.png',
 'TSR23_kaji_v004_15.png',
 'TSR23_kaji_v004_16.png',
 'TSR23_kaji_v004_200.png',
 'TSR23_kaji_v004_99.png']


sorted(l, key=get_last_int)
# ['TSR23_kaji_v004_10.png',
#  'TSR23_kaji_v004_11.png',
#  'TSR23_kaji_v004_12.png',
#  'TSR23_kaji_v004_13.png',
#  'TSR23_kaji_v004_14.png',
#  'TSR23_kaji_v004_15.png',
#  'TSR23_kaji_v004_16.png',
#  'TSR23_kaji_v004_99.png',
#  'TSR23_kaji_v004_100.png',
#  'TSR23_kaji_v004_200.png',
#  'TSR23_kaji_v004_1000.png']
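To tie this back to the os.listdir loop in the question, here is a rough sketch. The names trailing_number and list_sorted_pngs are made up for illustration. It also assumes you only care about .png files and that any .png without a trailing number should simply sort first (key -1), since re.search returns None when nothing matches:

```python
import os
import re

# One or more digits directly before a final ".png"
_TRAILING_NUMBER = re.compile(r'(\d+)\.png$')


def trailing_number(name):
    """Return the integer right before '.png', or -1 if there is none."""
    match = _TRAILING_NUMBER.search(name)
    return int(match.group(1)) if match else -1


def list_sorted_pngs(path):
    """List the .png files in `path`, ordered by their trailing number."""
    names = [f for f in os.listdir(path) if f.endswith('.png')]
    return sorted(names, key=trailing_number)
```

You would then loop over list_sorted_pngs(path) instead of os.listdir(path). Note that os.listdir makes no promise about the order of the names it returns, so sorting explicitly is needed either way.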

CodePudding user response:

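Use the third-party natsort package, which sorts strings in natural (human) order, so embedded numbers are compared by value: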

# pip install natsort
from natsort import natsorted

l = ['TSR23_kaji_v004_10.png',
 'TSR23_kaji_v004_100.png',
 'TSR23_kaji_v004_1000.png',
 'TSR23_kaji_v004_11.png',
 'TSR23_kaji_v004_12.png',
 'TSR23_kaji_v004_13.png',
 'TSR23_kaji_v004_14.png',
 'TSR23_kaji_v004_15.png',
 'TSR23_kaji_v004_16.png',
 'TSR23_kaji_v004_200.png',
 'TSR23_kaji_v004_99.png']


out = natsorted(l)

Output:

['TSR23_kaji_v004_10.png',
 'TSR23_kaji_v004_11.png',
 'TSR23_kaji_v004_12.png',
 'TSR23_kaji_v004_13.png',
 'TSR23_kaji_v004_14.png',
 'TSR23_kaji_v004_15.png',
 'TSR23_kaji_v004_16.png',
 'TSR23_kaji_v004_99.png',
 'TSR23_kaji_v004_100.png',
 'TSR23_kaji_v004_200.png',
 'TSR23_kaji_v004_1000.png']
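If installing a package is not an option, a similar natural ordering can be approximated with the standard library alone. This is only a sketch and not how natsort itself works internally. The name natural_key is made up, and lowercasing the text chunks (case-insensitive comparison) is my own choice:

```python
import re


def natural_key(name):
    """Split a name into text and number chunks so numbers compare by value."""
    # re.split with a capturing group keeps the digit runs; they always land
    # at the odd indexes, with (possibly empty) text at the even indexes.
    parts = re.split(r'(\d+)', name)
    return [int(p) if i % 2 else p.lower() for i, p in enumerate(parts)]


names = ['TSR23_kaji_v004_100.png', 'TSR23_kaji_v004_99.png', 'TSR23_kaji_v004_10.png']
print(sorted(names, key=natural_key))
```

Unlike get_last_int-style keys, this compares every number in the name from left to right, so it also orders files that differ in an earlier number such as the version part.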