How to sort a nested list in python when the nested lists have different lengths or missing elements

Time:07-26

I have a list of items that I want to sort by their sub-lists. I want to do so without modifying the original list in any way other than reordering it.

The list might look like:

[
    ("item A", dataA, [("sort 1a", 0.37), ("sort 2a", 0.11)], dataB, dataC),
    (
        "item B",
        dataA,
        [
            ("sort 1b", 0.37),
            ("sort 2b", 0.66),
            ("sort 3b", 0.85),
            ("sort 4b", 0.63),
            ("sort 5b", 0.26),
        ],
        dataB,
        dataC,
    ),
    (
        "item C",
        dataA,
        [("sort1c", 0.37), ("sort 2c", 0.15), ("sort 3c", 0.60)],
        dataB,
        dataC,
    ),
    (
        "item D",
        dataA,
        [
            ("sort 1d", 0.37),
            ("sort 2d", 0.66),
            ("sort 3d", 0.85),
            ("sort 4d", 0.63),
            ("sort 5d", 0.8),
        ],
        dataB,
        dataC,
    ),
]

I would like to sort first by the number following 'sort 1', then (where the 'sort 1' values are equal) by 'sort 2' (if it exists), then (where the 'sort 1' and 'sort 2' values are equal) by 'sort 3' (if it exists), and so on.

This would result in a list like:

[
    ("item A", dataA, [("sort 1a", 0.37), ("sort 2a", 0.11)], dataB, dataC),
    (
        "item C",
        dataA,
        [("sort1c", 0.37), ("sort 2c", 0.15), ("sort 3c", 0.60)],
        dataB,
        dataC,
    ),
    (
        "item B",
        dataA,
        [
            ("sort 1b", 0.37),
            ("sort 2b", 0.66),
            ("sort 3b", 0.85),
            ("sort 4b", 0.63),
            ("sort 5b", 0.26),
        ],
        dataB,
        dataC,
    ),
    (
        "item D",
        dataA,
        [
            ("sort 1d", 0.37),
            ("sort 2d", 0.66),
            ("sort 3d", 0.85),
            ("sort 4d", 0.63),
            ("sort 5d", 0.8),
        ],
        dataB,
        dataC,
    ),
]

I have tried approaches like taking the maximum length of the sort sub-lists, then incrementing i and sorting with key=lambda x: x[2][i][1] until i exceeds that length. But this just raises index errors, since indexing fails on the shorter sub-lists.

I've also tried using key=lambda x: (x[2][i][1] not in x, x.get(x[2][i][1], None)), but get only works on dictionaries.

Help appreciated!

CodePudding user response:

One approach:

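def extract_key(e):
    # Collect the numeric values from the sub-list at index 2 as the sort key.
    return [v for _, v in e[2]]

# lst is the list of item tuples from the question.
res = sorted(lst, key=extract_key)
print(res)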

Output

[('item A', 10, [('sort 1a', 0.37), ('sort 2a', 0.11)], 15, 16),
 ('item C', 10, [('sort1c', 0.37), ('sort 2c', 0.15), ('sort 3c', 0.6)], 15, 16),
 ('item B', 10, [('sort 1b', 0.37), ('sort 2b', 0.66), ('sort 3b', 0.85), ('sort 4b', 0.63), ('sort 5b', 0.26)], 15, 16),
 ('item D', 10, [('sort 1d', 0.37), ('sort 2d', 0.66), ('sort 3d', 0.85), ('sort 4d', 0.63), ('sort 5d', 0.8)], 15, 16)]
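
Why this works without padding: Python compares lists element by element, and when one list is a prefix of another, the shorter one sorts first. The short sketch below shows this on the value lists alone. The names keys, ordered and prefix_first are only illustrative and are not part of the answer above.

```python
# The value lists that extract_key would return for items B, A, C and D.
keys = [
    [0.37, 0.66, 0.85, 0.63, 0.26],  # item B
    [0.37, 0.11],                    # item A
    [0.37, 0.15, 0.60],              # item C
    [0.37, 0.66, 0.85, 0.63, 0.8],   # item D
]

# Lists are compared lexicographically: the first differing element decides.
ordered = sorted(keys)

# If every shared element is equal, the shorter list is considered smaller.
prefix_first = [0.37] < [0.37, 0.11]

print(ordered)
print(prefix_first)
```

Since sorted returns a new list, the original list is left untouched apart from the new ordering in res.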

This approach assumes the sort* items always appear in order and none is skipped, i.e. it does not handle cases like:

('sort 1a', 0.37), ('sort 3a', 0.11)
('sort 2a', 0.37), ('sort 1a', 0.11)
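
If such cases can occur, one possible variation is to read the position from each label and build the key from that. This is only a sketch under two assumptions that the question does not confirm: the first number in each label (for example the 3 in 'sort 3a') is the sort position, and a skipped position should sort before any present value. The function name extract_key_by_position, the missing parameter and the sample items are all hypothetical.

```python
import re

def extract_key_by_position(e, missing=float("-inf")):
    # Map each sort position (assumed to be the first number in the label)
    # to its value, regardless of the order the pairs appear in.
    by_position = {}
    for label, value in e[2]:
        match = re.search(r"\d+", label)
        if match is None:
            raise ValueError(f"no sort position found in label {label!r}")
        by_position[int(match.group())] = value
    if not by_position:
        return []
    # Fill skipped positions with the placeholder (an assumed policy).
    return [by_position.get(i, missing) for i in range(1, max(by_position) + 1)]

# Hypothetical sample data: item A is out of order, item B skips 'sort 2'.
items = [
    ("item A", None, [("sort 2a", 0.37), ("sort 1a", 0.11)]),
    ("item B", None, [("sort 1b", 0.11), ("sort 3b", 0.5)]),
    ("item C", None, [("sort 1c", 0.11), ("sort 2c", 0.2)]),
]

res = sorted(items, key=extract_key_by_position)
names = [e[0] for e in res]
print(names)
```

Passing missing=float("inf") instead would push items with a skipped position after those that have it; which behaviour is right depends on the data.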