How to sort when given a set of precedence pairs/rules

Time:10-16

I have a set of tuples that represent precedence. For example, the tuple (a, b) means that a must come before b (but not necessarily immediately). How can I use this precedence set in a sort function such that the elements of a given list thereafter obey the precedence constraints?

I recognize that such an approach might run into conflicts/cycles. However, I'm certain that my precedence construction doesn't produce this problem. This can be verified by adding all constraints as edges of a directed graph and checking that no cycle arises.
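For reference, one way to run the cycle check mentioned above is with the standard library's graphlib module (Python 3.9+). This is only a sketch: the helper name has_cycle is made up for illustration, and it assumes each rule is a (before, after) tuple.

```python
from graphlib import TopologicalSorter, CycleError

def has_cycle(pairs):
    """Return True if the (before, after) pairs contain a cycle.

    Hypothetical helper for illustration, not part of any library.
    """
    ts = TopologicalSorter()
    for before, after in pairs:
        # add(node, *predecessors): `before` must precede `after`
        ts.add(after, before)
    try:
        ts.prepare()  # raises CycleError if the graph is not a DAG
    except CycleError:
        return True
    return False

print(has_cycle({(5, 2)}))          # False
print(has_cycle({(5, 2), (2, 5)}))  # True
```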

I attempted a custom comparer that returns less-than when the pair is in the set, greater-than when the reversed pair is in the set, and equal in all other situations. However, this does not produce the desired result of having all precedence constraints satisfied. E.g.:

import functools as ft

a = [3, 2, 7, 6, 5]
ps = {(5, 2)}  # (x, y) means x must come before y

def cmp(x, y):
    if (x, y) in ps: return -1
    if (y, x) in ps: return 1
    return 0

sorted(a, key=ft.cmp_to_key(cmp))
Out[9]: [3, 2, 7, 6, 5]  # bad result, expected 5 before 2

CodePudding user response:

This can't work. Sorting does not compare every element with every other element, so even though comparing 2 and 5 directly would report that 5 is less than 2, it is quite likely that 2 and 5 are never compared directly. Instead, 2 might be compared to 7, 7 to 6, and 6 to 5; since each of those pairs compares equal, Python concludes that 2 is equal to 6 and to 5 without ever having compared them. In fact, the Timsort algorithm Python uses starts by scanning for runs that are already in order, which performs essentially the adjacent comparisons just described. In another language's quicksort with a random pivot you might occasionally get the result you want, but Python's sort is more predictable and is unlikely to perform the comparison you need unless the two values are already adjacent.
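To see this for yourself, you can log which pairs the sort actually asks the comparator about. The sketch below reuses the question's data; the names logging_cmp and compared are made up for illustration, and the exact pairs logged are a CPython implementation detail that could differ elsewhere.

```python
import functools as ft

ps = {(5, 2)}   # the question's single rule: 5 before 2
compared = []   # every (x, y) pair the sort actually asks about

def logging_cmp(x, y):
    compared.append((x, y))
    if (x, y) in ps:
        return -1
    if (y, x) in ps:
        return 1
    return 0

result = sorted([3, 2, 7, 6, 5], key=ft.cmp_to_key(logging_cmp))

print(result)    # [3, 2, 7, 6, 5], the unchanged order from the question
print(compared)  # a handful of neighbouring pairs, far fewer than all 10
print((5, 2) in compared or (2, 5) in compared)  # False on CPython
```

The one comparison that carries any information is never requested, so the rule never gets a chance to influence the result.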

The rules for sorting require transitive relationships for a general-purpose sorting algorithm to work (Python assumes the values involved form a "total ordering"). Your comparator isn't transitive: it says 5 < 2, but it also says that 5 and 2 are each equal to every other number, so 5 == 3 and 3 == 2 do not lead to 5 == 2.

There are ways to sort when you only have a partial, not a total, ordering, but you have to implement them yourself; Python's built-in sort can't do it for you.
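One such way is a topological sort. Below is a minimal sketch using Kahn's algorithm, not a definitive implementation: the function name precedence_sort is hypothetical, it assumes the items are hashable and distinct, and it breaks ties by original position so that unconstrained elements keep their relative order.

```python
import heapq
from collections import defaultdict

def precedence_sort(items, pairs):
    """Order `items` so every (before, after) pair in `pairs` is respected.

    Hypothetical helper: Kahn's algorithm, always emitting the free
    element that appears earliest in `items`.
    """
    position = {x: i for i, x in enumerate(items)}
    successors = defaultdict(list)
    blockers = {x: 0 for x in items}  # unmet predecessors per item
    for before, after in pairs:
        # Assumption: rules that mention an absent item are ignored.
        if before in position and after in position:
            successors[before].append(after)
            blockers[after] += 1

    # Min-heap of original positions of items with no unmet predecessors.
    ready = [position[x] for x in items if blockers[x] == 0]
    heapq.heapify(ready)
    out = []
    while ready:
        x = items[heapq.heappop(ready)]
        out.append(x)
        for y in successors[x]:
            blockers[y] -= 1
            if blockers[y] == 0:
                heapq.heappush(ready, position[y])
    if len(out) != len(items):
        raise ValueError("precedence rules contain a cycle")
    return out

print(precedence_sort([3, 2, 7, 6, 5], {(5, 2)}))  # [3, 7, 6, 5, 2]
```

On Python 3.9+, graphlib.TopologicalSorter in the standard library also produces a valid order, though it does not promise to keep unconstrained elements in their original positions.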
