My PostgreSQL DB appears to be using en_US.UTF-8
collation:
# SHOW lc_collate;
lc_collate
-------------
en_US.UTF-8
If I have a list of strings like: ['C - test', 'Common Scope']
, and I sort them in Python, I get:
sorted(['C - test', 'Common Scope'])
['C - test', 'Common Scope']
but in Postgres, I get the opposite order:
# select * from TEST ORDER BY name;
name
--------------
Common Scope
C - test
Having Postgres sort the same way as Python does seems to be achievable by adding COLLATE "C"
to the end of the select.
Is it possible to go the other way, and have Python sort strings the same was as Postgres does?
CodePudding user response:
You will need to use the Python locale module. Also the functools.cmp_to_key function:
import locale
from functools import cmp_to_key
locale.setlocale(locale.LC_ALL, 'en_US.UTF-8')
locale.getlocale()
('en_US', 'UTF-8')
sorted(['C - test', 'Common Scope'], key=cmp_to_key(locale.strcoll))
['Common Scope', 'C - test']
See Comparison Operators for more information. The important part is:
It is common to encounter comparison functions when translating algorithms from other languages. Also, some libraries provide comparison functions as part of their API. For example, locale.strcoll() is a comparison function.
To accommodate those situations, Python provides functools.cmp_to_key to wrap the comparison function to make it usable as a key function:
sorted(words, key=cmp_to_key(strcoll)) # locale-aware sort order