Python Arrays sorting folder names

Time:05-06

Hi, could anyone explain to me how to sort the folder names in order of creation?

    list1=["New folder", "New folder 1", "New folder 10", "New folder 2"]

    sorted(list1)
    print("output ",list1)

The output is currently:

>>  ["New folder", "New folder 1", "New folder 10", "New folder 2"]

I would like it to show:

["New folder", "New folder 1", "New folder 2", "New folder 10"]

The reason I need this order is because I am opening directories via these folder names in order.

Below is the code I am using to grab these folder names. Maybe there is a way to sort them in the required order as I build the array?

    itrprev = iter(os.walk(previousdir))
    root, dirs, files = next(itrprev)
    for next_root, next_dirs, next_files in itrprev:  # get second element onwards
        print("Next dir full path: : ", next_root)
        singlefoldername = os.path.split(next_root)
        fullfoldername = os.path.abspath(next_root)

CodePudding user response:

sorted() returns a new, sorted copy of the list and leaves the original unchanged. You were calling it and then doing nothing with the result. Either assign the result to a variable, or use .sort() to sort the list in place.
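
To make the difference concrete, here is a minimal sketch (the list contents and variable names are only for illustration):

```python
list1 = ["b", "c", "a"]

# sorted() builds and returns a new list; the original is left untouched
result = sorted(list1)
print(result)  # ['a', 'b', 'c']
print(list1)   # ['b', 'c', 'a']

# list.sort() reorders the list in place and returns None
returned = list1.sort()
print(list1)     # ['a', 'b', 'c']
print(returned)  # None
```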

You also need a key function to specify that you are sorting by the number at the end of each name, not alphabetically.

import re

def order(x):
    try:
        # This searches and returns the first number found in the folder name
        return int(re.search(r"\d+", x).group(0))
    except (ValueError, TypeError, AttributeError):
        return 0


list1.sort(key=order)
output  ['New folder', 'New folder 1', 'New folder 2', 'New folder 10']

Edit:

To account for letters as well:

list1 = ["Abc 1", "Abc 10", "Abc", "Test", "Test 20", "New folder",
         "New folder 1", "New folder 10", "New folder 2"]


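def order(x):
    try:
        # Sort by the text before the trailing number first, then by the number itself
        number = re.search(r"\d+$", x).group(0)
        return (x[:-len(number)].rstrip(), int(number))
    except (ValueError, TypeError, AttributeError):
        # No trailing number: sort on the name alone
        return (x, 0)


list1.sort(key=order)
output  ['Abc', 'Abc 1', 'Abc 10', 'New folder', 'New folder 1', 'New folder 2', 'New folder 10', 'Test', 'Test 20']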

As for your other question: sort after you have collected all the folder names, not as you're adding them.
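
A rough sketch of that idea, assuming you only need the immediate subfolders of one directory. The helper names `trailing_number` and `sorted_subfolders` are made up for this example:

```python
import os
import re


def trailing_number(name):
    # Hypothetical key: the number at the end of the name, or 0 if there is none
    match = re.search(r"\d+$", name)
    return int(match.group(0)) if match else 0


def sorted_subfolders(parent):
    # The first item from os.walk() is (root, dirs, files) for the top level;
    # the default covers the case where the directory cannot be walked
    _root, dirs, _files = next(os.walk(parent), (parent, [], []))
    # Collect everything first, then sort once
    return sorted(dirs, key=trailing_number)
```

You could then loop with something like `for name in sorted_subfolders(previousdir):` and build each full path with `os.path.join(previousdir, name)`.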

CodePudding user response:

As the items in your list are all strings, "sorted" sorts the items by alphabetical order. All strings start with "New folder".

"New folder" comes first because it is a prefix of all the others. The second one is obviously "New folder 1". The third one is "New folder 10" because alphabetical order compares strings character by character: the character at index 11 (counting from 0) of "New folder 10" is "1", while in "New folder 2" it is "2", and "1" sorts before "2".
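
You can check this yourself with a short snippet (the variable names are just for illustration):

```python
a = "New folder 10"
b = "New folder 2"

# Find the first position where the two names differ
index = next(i for i, (x, y) in enumerate(zip(a, b)) if x != y)
print(index, a[index], b[index])  # 11 1 2

# "1" sorts before "2", so the whole string "New folder 10" sorts first
print(a < b)  # True
```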

There are (I guess) many ways of solving your problem, one may be:

sorted_list = sorted(list1, key=lambda x: int(x.split(" ")[-1]))

Edit: Two members have rightly corrected my answer. First, the function to use is sorted, not sort.

Next up, my code will raise an error with the "New folder" that's missing an int, as the lambda will try to convert "folder" to a number.

You can try:

def sorter(x):
    try:
        i = int(x.split(" ")[-1])
        return i
    except ValueError:
        return 0

sorted_list = sorted(list1, key=sorter)

CodePudding user response:

It's sorting these folders alphabetically since the values here are strings.

As strings, the correct order is:

New folder 1
New folder 10
New folder 2

So this could be a hard problem if you don't know the format of the folder names to begin with.

The easiest solution is to rename the folders:

New folder 01
New folder 02
New folder 10
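
To show why padding helps, here is a small sketch. `pad_number` is a hypothetical helper, and the width of 2 is an assumption that only holds for up to 99 folders:

```python
names = ["New folder 1", "New folder 10", "New folder 2"]


def pad_number(name, width=2):
    # Hypothetical helper: zero-pad the last space-separated token if it is a number
    head, _, tail = name.rpartition(" ")
    return f"{head} {tail.zfill(width)}" if tail.isdigit() and head else name


# With equal-width numbers, plain alphabetical sorting gives the numeric order
padded = sorted(pad_number(n) for n in names)
print(padded)  # ['New folder 01', 'New folder 02', 'New folder 10']
```

This only computes the padded names. Actually renaming the folders on disk would be a separate step (for example with `os.rename`).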

But if you don't have a standard naming format (some folders have numbers, some don't), then you'll have to conditionally split the string at the number, parse the number as an int, and use that to decide the order. That also becomes really awkward logic if you start having numbers between strings ("New 1 Folder"), or especially if you have anything with a date ("New Folder 3-20-2021").
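
One common way to handle mixed formats is a "natural sort" key that splits every name into text and number runs. This is only a sketch, not something from the question, and `natural_key` is a name made up for the example:

```python
import re


def natural_key(name):
    # Split into alternating text and digit runs,
    # e.g. "New 1 Folder" -> ["New ", "1", " Folder"]
    parts = re.split(r"(\d+)", name)
    # Odd positions are always the digit runs; compare them as integers
    # and compare the text runs case-insensitively
    return [int(p) if i % 2 else p.lower() for i, p in enumerate(parts)]


names = ["New folder 10", "New 1 Folder", "New folder", "New folder 2", "New folder 1"]
print(sorted(names, key=natural_key))
# ['New 1 Folder', 'New folder', 'New folder 1', 'New folder 2', 'New folder 10']
```

Note that a date such as "New Folder 3-20-2021" would be compared as month, then day, then year, which is not chronological order, so dates would still need their own parsing.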

CodePudding user response:

You need to create a custom sort key function. The function checks whether each name ends in a number and returns that number (or 0 if there is none), so the list can be sorted on it. The sort order defaults to ascending.

def customer_sort(foldername):
    foldername_list = foldername.split(' ')
    if not foldername_list[-1].isnumeric():
        return 0
    return int(foldername_list[-1])

Result

list1 = ["New folder", "New folder 1", "New folder 10", "New folder 2"]
sorted(list1, key=customer_sort)
# ['New folder', 'New folder 1', 'New folder 2', 'New folder 10']