Home > Back-end >  Complex lambda expression as the key argument to sorted function
Complex lambda expression as the key argument to sorted function

Time:09-17

I have an example code:

In [60]: sorted([3, 1, None], key=lambda x: (x is None, x))
Out[60]: [1, 3, None]

In [61]: sorted([3, 1, None], key=lambda x: (x is not None, x))
Out[61]: [None, 1, 3]

I think I do understand what it does - looks like it allows skipping the None values of the key during sorting, otherwise sorting will raise a TypeError trying to compare int to None - but I don't understand how and why it works as it does. Specifically I am confused by the lambda function that returns a tuple.

I could not find anything relevant in the sorting HOWTO. I would appreciate an explanation or a link to where this behavior is documented.

CodePudding user response:

Tuples are compared and sorted in lexicographical order:

>>> sorted([(0, 17), (1,15), (0,12), (0, 9), (1, 8), (1, 7), (0, 2)])
[(0, 2), (0, 9), (0, 12), (0, 17), (1, 7), (1, 8), (1, 15)]
>>> (0, 12) < (1, 9)
True
>>> (0, 12) < (0, 13)
True
>>> (1, 12) < (0, 9)
False

"Lexicographical order" is just a fancy word for "like words in an English dictionary": compare the first letter first, and only if the first letter is equal, then compare the second letter, and only if the first two letters are equal, then compare the third letter, etc.

Using a tuple as the key, the tuples will also be compared in lexicographical order.

In your case, x is None and x is not None evaluate to boolean values, True or False.

Boolean values can be compared too:

>>> False < True
True
>>> True < False
False

As a result, sorted([3, 1, None], key=lambda x: (x is None, x)) will consider that the smallest elements are those for which x is None is False, and the biggest elements are those for which x is None is True:

>>> sorted([3, 1, None], key=lambda x: (x is None, x))
[1, 3, None]
>>> sorted(map(lambda x: (x is None, x), [3, 1, None]))
[(False, 1), (False, 3), (True, None)]

CodePudding user response:

The key parameter of sorted is used to rank the values, in order of the elements if more than one.

What your code does is to put None first (or last), and as a secondary key, sort the non-None elements

First key: [3, 1, None] is mapped to [False, False, True] that is equivalent to [0, 0, 1] and pushes None to the end.

Second key is to break the tie between 3 and 1

  • Related