We have a table
called tblINUser
, which has many records and occupies a vast amount of space. In an attempt to reduce the amount of used space, we create a table called tblINUserSortByFilter
which contains all the possible text values of this field and we create a foreign key in tblINUser
that numerically references this value. We have several databases, because this database is sharded, so it would be great to sort the values similarly accross databases. This was the first attempt:
CREATE TABLE MC.tblINUserSortByFilterType(
pkINUserSortByFilterTypeID SMALLINT(6) PRIMARY KEY AUTO_INCREMENT,
SortByFilter varchar(45) COLLATE utf8mb4_unicode_ci NOT NULL DEFAULT 'first',
INDEX(SortByFilter)
);
INSERT INTO MC.tblINUserSortByFilterType(SortByFilter)
SELECT DISTINCT SortByFilter
FROM MC.tblINUser
ORDER BY SortByFilter = 'first';
ALTER TABLE MC.tblINUser
ADD COLUMN fkINUserSortByFilterTypeID SMALLINT(6) DEFAULT 1,
ADD INDEX (fkINUserSortByFilterTypeID);
UPDATE MC.tblINUser INUser
JOIN MC.tblINUserSortByFilterType INUserSortByFilterType
ON INUser.SortByFilter = INUserSortByFilterType.SortByFilter
SET INUser.fkINUserSortByFilterTypeID = INUserSortByFilterType.pkINUserSortByFilterTypeID;
ALTER TABLE MC.tblINUser
DROP COLUMN SortByFilter;
You may argue, correctly that the sort has the only criteria, which is ORDER BY SortByFilter = 'first'
and a clause of ORDER BY SortByFilter = 'first', SortByFilter
would be an obvious improvement. This would be a correct criticism, yet, even though we may have a chaotic behavior starting from the second record, it would be reasonable to expect that the very first inserted record would be first
, yet, unfortunately, this is not the case. Running select * from MC.tblINUserSortByFilterType;
yields
---------------------------- ----------------------------
| pkINUserSortByFilterTypeID | SortByFilter |
---------------------------- ----------------------------
| 5 | first |
| 4 | first-ASC |
| 3 | last |
| 1 | none |
| 2 | StatTeacher.IsActive DESC |
---------------------------- ----------------------------
as we can see, not even this expectation is met, since first
has an id of 5. An improvement is achieved by changing the inserts to
INSERT INTO MC.tblINUserSortByFilterType(SortByFilter)
SELECT DISTINCT SortByFilter
FROM MC.tblINUser
WHERE SortByFilter = 'first';
INSERT INTO MC.tblINUserSortByFilterType(SortByFilter)
SELECT DISTINCT SortByFilter
FROM MC.tblINUser
WHERE SortByFilter <> 'first';
and then the result of the same selection we get this result:
---------------------------- ----------------------------
| pkINUserSortByFilterTypeID | SortByFilter |
---------------------------- ----------------------------
| 1 | first |
| 3 | first-ASC |
| 4 | last |
| 2 | none |
| 5 | StatTeacher.IsActive DESC |
---------------------------- ----------------------------
5 rows in set (0.00 sec)
as we can see, first
is correctly receiving a value of 1. Yet, it seems that if we run the same inserts over different copies of the database, the order of subsequent rows might be unreliable. So, how could we ensure that the records would be inserted in the exact order that the following query yields?
SELECT DISTINCT SortByFilter
FROM MC.tblINUser
WHERE SortByFilter = 'first', SortByFilter;
I know that we can solve this by
- using a cursor for the insert
- looping the records received
- inserting them individually
But that would have as many insert
statements as the number of records the above query yields. Is there a way to achieve the same with a single command?
CodePudding user response:
it would be reasonable to expect that the very first inserted record would be
first
I don't think so. You used ORDER BY SortByFilter = 'first'
which returns 0 for all values except 'first', followed by 1 for 'first'. The value 1 sorts after the value 0, so the entry 'first' ends up being last. The other values end up sorting more or less randomly.
Demo:
mysql> create table mytable (SortByFilter varchar(64));
Query OK, 0 rows affected (0.02 sec)
mysql> insert into mytable values ('first'), ('first-ASC'),
('last'), ('none'), ('StatTeacher.IsActive DESC');
Query OK, 5 rows affected (0.01 sec)
Records: 5 Duplicates: 0 Warnings: 0
mysql> select SortByFilter='first', SortByFilter from mytable
order by SortByFilter = 'first';
---------------------- ---------------------------
| SortByFilter='first' | SortByFilter |
---------------------- ---------------------------
| 0 | first-ASC |
| 0 | last |
| 0 | none |
| 0 | StatTeacher.IsActive DESC |
| 1 | first |
---------------------- ---------------------------
I suggest do not rely on automatic sorting. Be specific about the sort order of every value. Here's one way to do it:
mysql> select field(SortByFilter, 'first', 'first-ASC',
'none', 'StatTeacher.IsActive DESC', 'last') AS SortOrder,
SortByFilter
from mytable order by SortOrder;
----------- ---------------------------
| SortOrder | SortByFilter |
----------- ---------------------------
| 1 | first |
| 2 | first-ASC |
| 3 | none |
| 4 | StatTeacher.IsActive DESC |
| 5 | last |
----------- ---------------------------
CodePudding user response:
To get the rows in a particular order, you must use an ORDER BY
. That is straightforward to do if the object of the ORDER BY
is a string and you want alphabetical order, or it is numeric and you want it in numeric order. Ditto for the reverse by using DESC
.
For for some abnormal ordering, here is one trick:
ORDER BY FIND_IN_SET(my_column, "first,second,third,fourth")
Another:
ORDER BY my_column != 'first', my_column
That will list 'first' first, then do the rest in alphabetic order. (I am assuming my_column
is a VARCHAR
.)
ORDER BY my_column = 'last', my_column
Note that a boolean expression evaluates to 0 (for false) or 1 (for true); I am then depending on the sort order of 0 and 1.