I'm trying to sort a list to while respecting the original sorting order and keeping its "logical integrity" – e.g. if an element is on the most right it will move to the most left when sorting. The background here is to make LDAP's sorting compatible with x509 certificates.
An element's distinguishedName
in LDAP looks as follows: CN=John Doe,OU=Marketing,OU=Users,DC=contoso,DC=loc
, but certificates require their names to be saved in the following format DC=loc,DC=contoso,OU=Users,OU=Marketing,CN=John Doe
.
Because I want to offer some error redundancy when users add normal x509 certificates I came up with a simple sort to normalize user inputs e.g. OU=Tech,ST=Berlin,C=DE,CN=test.com
becomes C=DE,ST=Berlin,OU=Tech,CN=test.com
; however I'm having some issues getting the desired results when parsing LDAP distinguishedName
s – it's not keeping the original ordering logic of the DC
s and OU
s.
My code looks like this:
SUBJECT_FIELDS = [
"DC",
"C",
"ST",
"L",
"O",
"OU",
"CN",
"emailAddress",
]
def sort_name(subject):
return sorted(subject, key=lambda e: SUBJECT_FIELDS.index(e[0]))
>>> subject = [('CN', 'John Doe'), ('OU', 'Users'), ('DC', 'contoso'), ('DC', 'loc')]
>>> sort_name(subject)
[('DC', 'contoso'), ('DC', 'loc'), ('OU', 'Users'), ('CN', 'John Doe')]
# should be [('DC', 'loc'), ('DC', 'contoso'), ('OU', 'Users'), ('CN', 'John Doe')]
>>> subject = [('CN', 'John Doe'), ('OU', 'Marketing'), ('OU', 'Users'), ('OU', 'OtherUnit'), ('DC', 'contoso'), ('DC', 'loc')]
>>> sort_name(subject)
[('DC', 'contoso'), ('DC', 'loc'), ('OU', 'Marketing'), ('OU', 'Users'), ('OU', 'OtherUnit'), ('CN', 'John Doe')]
# should be [('DC', 'loc'), ('DC', 'contoso'), ('OU', 'OtherUnit'), ('OU', 'Users'), ('OU', 'Marketing'), ('CN', 'John Doe')]
>>> subject = [('DC', 'loc'), ('DC', 'contoso'), ('DC', 'ad'), ('OU', 'Users'), ('CN', 'John Doe')]
>>> sort_name(subject)
[('DC', 'loc'), ('DC', 'contoso'), ('DC', 'ad'), ('OU', 'Users'), ('CN', 'John Doe')]
# works as intended
>>> subject = [('OU', 'Users'), ('CN', 'John Doe'), ('DC', 'contoso'), ('DC', 'loc')]
>>> sort_name(subject)
[('DC', 'contoso'), ('DC', 'loc'), ('OU', 'Users'), ('CN', 'John Doe')]
# probably never going to happen that somebody screws an OU up like this, but should be:
# [('DC', 'loc'), ('DC', 'contoso'), ('OU', 'Users'), ('CN', 'John Doe')]
So the issue is that sorted does not "reverse" multiple occurrences of the same sorting-key
, thus screwing up the original logical integrity of the name.
I have already tried extending the sorted
function with a weighted ordering based on how far on start or end of the list the subject field is.
def sort_name(subject):
half_index = len(subject) // 2
relative_index = lambda x: len(subject) - subject.index(x) \
if subject.index(x) 1 > half_index \
else subject.index(x) - len(subject)
sorted_fields = sorted(subject, key=lambda e: (SUBJECT_FIELDS.index(e[0]), relative_index(e)))
return sorted(subject, key=lambda e: (SUBJECT_FIELDS.index(e[0]), relative_index(e)))
But unfortunately that seems to involve a lot more unexpected results.
CodePudding user response:
Try:
def sort_func(s):
return [
t
for _, t in sorted(
enumerate(s),
key=lambda k: (SUBJECT_FIELDS.index(k[1][0]), -k[0]),
)
]
subject = [
("CN", "John Doe"),
("OU", "Users"),
("DC", "contoso"),
("DC", "loc"),
]
print(sort_func(subject))
Prints:
[('DC', 'loc'), ('DC', 'contoso'), ('OU', 'Users'), ('CN', 'John Doe')]
Other subject
s:
subject = [
("CN", "John Doe"),
("OU", "Marketing"),
("OU", "Users"),
("OU", "OtherUnit"),
("DC", "contoso"),
("DC", "loc"),
]
print(sort_func(subject))
subject = [
("OU", "Users"),
("CN", "John Doe"),
("DC", "contoso"),
("DC", "loc"),
]
print(sort_func(subject))
subject = [
("DC", "ad"),
("DC", "contoso"),
("DC", "loc"),
("OU", "Users"),
("CN", "John Doe"),
]
print(sort_func(subject))
Prints:
[('DC', 'loc'), ('DC', 'contoso'), ('OU', 'OtherUnit'), ('OU', 'Users'), ('OU', 'Marketing'), ('CN', 'John Doe')]
[('DC', 'loc'), ('DC', 'contoso'), ('OU', 'Users'), ('CN', 'John Doe')]
[('DC', 'loc'), ('DC', 'contoso'), ('DC', 'ad'), ('OU', 'Users'), ('CN', 'John Doe')]
EDIT 2:
SUBJECT_FIELDS = [
"DC",
"C",
"ST",
"L",
"O",
"OU",
"CN",
"emailAddress",
]
def sort_func(s):
if SUBJECT_FIELDS.index(s[0][0]) > SUBJECT_FIELDS.index(s[-1][0]):
return sorted(s[::-1], key=lambda k: SUBJECT_FIELDS.index(k[0]))
return sorted(s, key=lambda k: SUBJECT_FIELDS.index(k[0]))
subject = [
("CN", "John Doe"),
("OU", "Users"),
("DC", "contoso"),
("DC", "loc"),
]
print(sort_func(subject))
subject = [
("CN", "John Doe"),
("OU", "Marketing"),
("OU", "Users"),
("OU", "OtherUnit"),
("DC", "contoso"),
("DC", "loc"),
]
print(sort_func(subject))
subject = [
("DC", "loc"),
("DC", "contoso"),
("DC", "ad"),
("OU", "Users"),
("CN", "John Doe"),
]
print(sort_func(subject))
subject = [
("OU", "Users"),
("CN", "John Doe"),
("DC", "contoso"),
("DC", "loc"),
]
print(sort_func(subject))
subject = [
("DC", "loc"),
("DC", "contoso"),
("CN", "John Doe"),
("OU", "Users"),
]
print(sort_func(subject))
Prints:
[('DC', 'loc'), ('DC', 'contoso'), ('OU', 'Users'), ('CN', 'John Doe')]
[('DC', 'loc'), ('DC', 'contoso'), ('OU', 'OtherUnit'), ('OU', 'Users'), ('OU', 'Marketing'), ('CN', 'John Doe')]
[('DC', 'loc'), ('DC', 'contoso'), ('DC', 'ad'), ('OU', 'Users'), ('CN', 'John Doe')]
[('DC', 'loc'), ('DC', 'contoso'), ('OU', 'Users'), ('CN', 'John Doe')]
[('DC', 'loc'), ('DC', 'contoso'), ('OU', 'Users'), ('CN', 'John Doe')]