Home > Mobile >  Optimizing mySQL query to scale better
Optimizing mySQL query to scale better

Time:10-27

I want to query a database to retrieve a list of names (the list of names is provided by the user in python). My criteria for looking up data for these names are the following: the results should appear in the order of the list of names the user provided, so if I say ...WHERE name = "Bob" OR name = "Alice" I want the results for Bob to come first followed by that of Alice. The second criteria is that if there is a search for a name twice, then the result should also contain it twice, so I want a way to write down ...WHERE name = 'Bob' OR name = 'Bob' so that the result also contains the rows for Bob twice.

I came up with the following query:

SELECT * FROM
   (SELECT *, 1 order_position FROM table WHERE name = 'Alice'
    UNION ALL
    SELECT *, 2 order_position FROM table WHERE name = 'Bob'
    UNION ALL
    SELECT *, 3 order_position FROM table WHERE name = 'Charlie'
    UNION ALL 
    SELECT *, 4 order_position FROM table WHERE name = 'Dan'
   ) r ORDER BY order_position

This query works well, but when the user submits hundreds of names and there are hundreds of UNION ALL sections, the query becomes extremely slow. Is there a way to improve the performance of the query while maintaining the two criteria mentioned before?

CodePudding user response:

SELECT *, CASE name WHEN 'Alice'   THEN 1
                    WHEN 'Bob'     THEN 2
                    WHEN 'Charlie' THEN 3
                    WHEN 'Dan'     THEN 4
                    END AS order_position 
FROM table 
WHERE name IN ('Alice', 'Bob', 'Charlie', 'Dan')
ORDER BY order_position;

or without additional column:

SELECT *
FROM table 
WHERE name IN ('Alice', 'Bob', 'Charlie', 'Dan')
ORDER BY CASE name WHEN 'Alice'   THEN 1
                   WHEN 'Bob'     THEN 2
                   WHEN 'Charlie' THEN 3
                   WHEN 'Dan'     THEN 4
                   END;

PS. For this names set ORDER BY name is enough.


How does this handle the requirement of repeating some results? – Willem Renzema

If you need in repeating then you must convert the list to a rowset.

SELECT table.*
FROM table
JOIN ( SELECT 1 pos, 'Alice'   name UNION ALL
       SELECT 2    , 'Bob'          UNION ALL
       SELECT 3    , 'Charlie'      UNION ALL
       SELECT 4    , 'Bob'          UNION ALL
       SELECT 5    , 'Charlie' ) names USING (name)
ORDER BY names.pos

CodePudding user response:

Somehow you have to construct the list of names with the order_position of each name.
You can do this in a query which uses UNION ALL to preserve duplicate names like this:

SELECT 'Alice' name, 1 order_position UNION ALL
SELECT 'Bob', 2 UNION ALL
SELECT 'Charlie', 3 UNION ALL
SELECT 'Dan', 4 UNION ALL
SELECT 'Alice', 1 UNION ALL
SELECT 'Bob', 2 UNION ALL
...............................

Then all you have to do is join it to the table:

SELECT t.* 
FROM tablename t
INNER JOIN (
  SELECT 'Alice' name, 1 order_position UNION ALL
  SELECT 'Bob', 2 UNION ALL
  SELECT 'Charlie', 3 UNION ALL
  SELECT 'Dan', 4 UNION ALL
  SELECT 'Alice', 1 UNION ALL
  SELECT 'Bob', 2 UNION ALL
  ...............................
) n ON n.name = t.name
ORDER BY n.order_position;

In MySql 8.0 you can use a CTE:

WITH cte(name, order_position) AS (VALUES 
  ROW('Alice', 1), ROW('Bob', 2), ROW('Charlie', 3), 
  ROW('Dan', 4), ROW('Alice', 1), ROW('Bob', 2),
  ...................................................
)
SELECT t.* 
FROM tablename t INNER JOIN cte c 
ON c.name = t.name
ORDER BY c.order_position;

CodePudding user response:

The following will be "fast" if name is indexed:

WHERE name IN             ('Alice', 'Bob', 'Charlie', 'Dan')
ORDER BY FIND_IN_SET(name, 'Alice,Bob,Charlie,Dan')

Note the syntax difference between Where and Order.

The following is likely to be somewhat slower because it cannot use any index but is simpler to code:

WHERE    FIND_IN_SET(name, 'Alice,Bob,Charlie,Dan')
ORDER BY FIND_IN_SET(name, 'Alice,Bob,Charlie,Dan')

Note the restriction in FIND_IN_SET that commas cannot be used in the items.

In no case will CASE or FIND_IN_SET() use an index. (Cf "sargable")

  • Related