Please consider this question:
Running a sqlite select
statement where I use order by tagTitle COLLATE NOCASE ASC
to sort my output:
$ echo 'select tagTitle from tags where bookName = "Dragons" order by tagTitle COLLATE NOCASE ASC ' | sqlite3 /tmp/IndexerSqlite.db |awk {'print substr($1,0,1)'} |uniq -c
1 &
3 -
2 .
2 /
1 1
1 ;
1 @
1 a
5 A
2 a
SNIP
3 W
4 X
1 x
1 X
1 Z
4 | <<--- This!
Notice how the |
is at the bottom - that's strange to me as everywhere else the |
symbol gets an earlier precedence when sorted.
Case in point:
$ echo "a
b
|
c
$
."|sort
.
| // <<-- this!
$
a
b
c
How can we get sqlite results in the reasonable order where |
gets stacked somewhere at the front, not at the tail end. Any advice must be much appreciated!
Repro:
$ sqlite3
SQLite version 3.31.1 2020-01-27 19:55:54
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.
sqlite> create table test("name" blob);
sqlite> insert into test (name) values (".csdcdsc");
sqlite> insert into test (name) values ("XACDSC");
sqlite> insert into test (name) values ("ACDSC");
sqlite> insert into test (name) values ("CDSC");
sqlite> insert into test (name) values ("|CDSC");
sqlite> select * from test order by name COLLATE NOCASE ASC;
.csdcdsc
ACDSC
CDSC
XACDSC
|CDSC << --This
sqlite>
CodePudding user response:
Sqlite sorts strings in lexicographic order by codepoint. Pipe is U 007C, putting it after English letters and a bunch of other symbols. sort(1)
sorts based on the current locale's rules; you probably are using one that does something fancy with punctuation and symbols. Switch to one that uses plain lexicographic order too, and...
echo "a
b
|
c
$
." | LC_ALL=C sort
$
.
a
b
c
|
You'd have to write and load a Sqlite extension module that defines a new collation order. How easy that will be depends on how well you know C.