Home > Enterprise >  Sort order and pipe symbol
Sort order and pipe symbol

Time:11-08

Running an SQLite select query where I use order by tagTitle COLLATE NOCASE ASC to sort :

$ echo 'select tagTitle from tags where bookName = "Dragons" order by tagTitle COLLATE NOCASE ASC ' | sqlite3 /tmp/IndexerSqlite.db |awk {'print substr($1,0,1)'} |uniq -c
      1 &
      3 -
      2 .
      2 /
      1 1
      1 ;
      1 @
      1 a
      5 A
      2 a
...
      3 W
      4 X
      1 x
      1 X
      1 Z
      4 | <<--- This!

| is at the bottom. Everywhere else | gets higher precedence:

$ echo "a
b
|
c
$
."|sort 
.
|    // <<-- this!
$
a
b
c

How can I get results in regular order where | is not at the end?

Repro:

$ sqlite3
SQLite version 3.31.1 2020-01-27 19:55:54
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.
sqlite>  create table test("name" blob);
sqlite>  insert into test (name) values (".csdcdsc");
sqlite>  insert into test (name) values ("XACDSC");
sqlite>  insert into test (name) values ("ACDSC");
sqlite>  insert into test (name) values ("CDSC");
sqlite>  insert into test (name) values ("|CDSC");
sqlite> select * from test order by name  COLLATE NOCASE ASC;
.csdcdsc
ACDSC
CDSC
XACDSC
|CDSC << --This
sqlite> 

CodePudding user response:

Sqlite sorts strings in lexicographic order by codepoint. Pipe is U 007C, putting it after English letters and a bunch of other symbols. sort(1) sorts based on the current locale's rules; you probably are using one that does something fancy with punctuation and symbols. Switch to one that uses plain lexicographic order too, and...

echo "a
b
|
c
$
." | LC_ALL=C sort 
$
.
a
b
c
|

You'd have to write and load a Sqlite extension module that defines a new collation order. How easy that will be depends on how well you know C.

  • Related