Home > Net >  What exactly means the collation 'de-DE-u-kn-true'
What exactly means the collation 'de-DE-u-kn-true'

Time:11-05

In order to sort a PostgreSQL 14 varchar column with numerical values, I'm using a collation created by

CREATE COLLATION de_pos (LOCALE = 'de-DE-u-kn-true', PROVIDER = 'icu');`

With such a collation ORDER BY results in correct order like this
1.2.10
1.2.20
1.2.100

Without that special collation ORDER BY results in
1.2.10
1.2.100
1.2.20

I want to know, what every part of that collation definition exactly means ?

de-DE => locale for german-germany
u => UTF8 ????
kn => ????
true => ????

CodePudding user response:

See ICU Collations in the PostgreSQL documentation. This links to the ICU documentation, which - with some indirection - leads to Unicode Locale Identifier, which makes clear that the -u introduces the Unicode Locale Extensions, and kn is one of those extensions. When you look at Collation Settings, you'll find kn configures numeric ordering. The true is the configuration of that option (meaning, numeric ordering is on):

If set to on, any sequence of Decimal Digits (General_Category = Nd in the [UAX44]) is sorted at a primary level with its numeric value. For example, "A-21" < "A-123". The computed primary weights are all at the start of the digit reordering group. Thus with an untailored UCA table, "a$" < "a0" < "a2" < "a12" < "a⓪" < "aa".

This is sometimes called “natural sort order”.

In other words, de-DE-u-kn-true is:

  • de: language German
  • DE: region Germany
  • u: what follows are Unicode Locale Extension
  • kn: Unicode Locale Extension numeric ordering
  • true: value of kn, meaning numeric ordering is on

CodePudding user response:

You can customize your collation's sorting behavior by specifying attribute names. In this case, kn-true stands for natural sorting order or numeric sort order that sorts numbers based on their numeric value.

  • Related