Home > Net >  sqlite sort order and pipe symbol
sqlite sort order and pipe symbol

Time:11-07

Please consider this question:

Running a sqlite select statement where I use order by tagTitle COLLATE NOCASE ASC to sort my output:

$ echo 'select tagTitle from tags where bookName = "Dragons" order by tagTitle COLLATE NOCASE ASC ' | sqlite3 /tmp/IndexerSqlite.db |awk {'print substr($1,0,1)'} |uniq -c
      1 &
      3 -
      2 .
      2 /
      1 1
      1 ;
      1 @
      1 a
      5 A
      2 a
  SNIP
      3 W
      4 X
      1 x
      1 X
      1 Z
      4 | <<--- This!

Notice how the | is at the bottom - that's strange to me as everywhere else the | symbol gets an earlier precedence when sorted.

Case in point:

$ echo "a
b
|
c
$
."|sort 
.
|    // <<-- this!
$
a
b
c

How can we get sqlite results in the reasonable order where | gets stacked somewhere at the front, not at the tail end. Any advice must be much appreciated!

Repro:

$ sqlite3
SQLite version 3.31.1 2020-01-27 19:55:54
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.
sqlite>  create table test("name" blob);
sqlite>  insert into test (name) values (".csdcdsc");
sqlite>  insert into test (name) values ("XACDSC");
sqlite>  insert into test (name) values ("ACDSC");
sqlite>  insert into test (name) values ("CDSC");
sqlite>  insert into test (name) values ("|CDSC");
sqlite> select * from test order by name  COLLATE NOCASE ASC;
.csdcdsc
ACDSC
CDSC
XACDSC
|CDSC << --This
sqlite> 

CodePudding user response:

Sqlite sorts strings in lexicographic order by codepoint. Pipe is U 007C, putting it after English letters and a bunch of other symbols. sort(1) sorts based on the current locale's rules; you probably are using one that does something fancy with punctuation and symbols. Switch to one that uses plain lexicographic order too, and...

echo "a
b
|
c
$
." | LC_ALL=C sort 
$
.
a
b
c
|

You'd have to write and load a Sqlite extension module that defines a new collation order. How easy that will be depends on how well you know C.

  • Related