Home > Blockchain >  How to use correct indexes with a double inner join query?
How to use correct indexes with a double inner join query?

Time:07-09

I have a query with 2 INNER JOIN statements, and only fetching a few column, but it is very slow even though I have indexes on all required columns.

My query

SELECT 
  dysfonctionnement, 
  montant, 
  listRembArticles, 
  case when dys.reimputation is not null then dys.reimputation else dys.responsable end as responsable_final
FROM 
  db.commandes AS com
  INNER JOIN db.dysfonctionnements AS dys ON com.id_commande = dys.id_commande
  INNER JOIN db.pe AS pe ON com.code_pe = pe.pe_id
WHERE 
  com.prestataireLAD REGEXP '.*' 
  AND pe_nom REGEXP 'bordeaux|chambéry-annecy|grenoble|lyon|marseille|metz|montpellier|nancy|nice|nimes|rouen|strasbourg|toulon|toulouse|vitry|vitry bis 1|vitry bis 2|vlg' 
  AND com.date_livraison BETWEEN '2022-06-11 00:00:00' 
  AND '2022-07-08 00:00:00';

It takes around 20 seconds to compute and fetch 4123 rows.

The problem

In order to find what's wrong and why is it so slow, I've used the EXPLAIN statement, here is the output:

| id | select_type | table | partitions | type   | possible_keys              | key         | key_len | ref                    | rows   | filtered | Extra       |
|----|-------------|-------|------------|--------|----------------------------|-------------|---------|------------------------|--------|----------|-------------|
|  1 | SIMPLE      | dys   |            | ALL    | id_commande,id_commande_2  |             |         |                        | 878588 |   100.00 | Using where |
|  1 | SIMPLE      | com   |            | eq_ref | id_commande,date_livraison | id_commande | 110     | db.dys.id_commande     |      1 |     7.14 | Using where |
|  1 | SIMPLE      | pe    |            | ref    | pe_id                      | pe_id       | 5       | db.com.code_pe         |      1 |   100.00 | Using where |

I can see that the dysfonctionnements JOIN is rigged, and doesn't use a key even though it could...

Table definitions

commandes (included relevant columns only)

CREATE TABLE `commandes` (
  `id` int(11) unsigned NOT NULL AUTO_INCREMENT,
  `id_commande` varchar(36) NOT NULL DEFAULT '',
  `date_commande` datetime NOT NULL,
  `date_livraison` datetime NOT NULL,
  `code_pe` int(11) NOT NULL,
  `traitement_dysfonctionnement` tinyint(4) DEFAULT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `id_commande` (`id_commande`),
  KEY `date_livraison` (`date_livraison`),
  KEY `traitement_dysfonctionnement` (`traitement_dysfonctionnement`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;

dysfonctionnements (again, relevant columns only)

CREATE TABLE `dysfonctionnements` (
  `id` int(11) unsigned NOT NULL AUTO_INCREMENT,
  `id_commande` varchar(36) DEFAULT NULL,
  `dysfonctionnement` varchar(150) DEFAULT NULL,
  `responsable` varchar(50) DEFAULT NULL,
  `reimputation` varchar(50) DEFAULT NULL,
  `montant` float DEFAULT NULL,
  `listRembArticles` text,
  PRIMARY KEY (`id`),
  UNIQUE KEY `id_commande` (`id_commande`,`dysfonctionnement`),
  KEY `id_commande_2` (`id_commande`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;

pe (again, relevant columns only)

CREATE TABLE `pe` (
  `id` int(11) unsigned NOT NULL AUTO_INCREMENT,
  `pe_id` int(11) DEFAULT NULL,
  `pe_nom` varchar(30) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `pe_nom` (`pe_nom`),
  KEY `pe_id` (`pe_id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;

Investigation

If I remove the db.pe table from the query and the WHERE clause on pe_nom, the query takes 1.7 seconds to fetch 7k rows, and with the EXPLAIN statement, I can see it is using keys as I expect it to do:

| id | select_type | table | partitions | type  | possible_keys              | key            | key_len | ref                    | rows   | filtered | Extra                                         |
|----|-------------|-------|------------|-------|----------------------------|----------------|---------|------------------------|--------|----------|-----------------------------------------------|
|  1 | SIMPLE      | com   |            | range | id_commande,date_livraison | date_livraison | 5       |                        | 389558 |   100.00 | Using index condition; Using where; Using MRR |
|  1 | SIMPLE      | dys   |            | ref   | id_commande,id_commande_2  | id_commande_2  | 111     | ooshop.com.id_commande |      1 |   100.00 |                                               |

I'm open to any suggestions, I see no reason not to use the key when it does on a very similar query and it definitely makes it faster...

CodePudding user response:

I had a similar experience when MySQL optimiser selected a joined table sequence far from optimal. At that time I used MySQL specific STRAIGHT_JOIN operator to overcome default optimiser behaviour. In your case I would try this:

SELECT 
  dysfonctionnement, 
  montant, 
  listRembArticles, 
  case when dys.reimputation is not null then dys.reimputation else dys.responsable end as responsable_final
FROM 
  db.commandes AS com
  STRAIGHT_JOIN db.dysfonctionnements AS dys ON com.id_commande = dys.id_commande
  INNER JOIN db.pe AS pe ON com.code_pe = pe.pe_id

Also, in your WHERE clause one of the REGEXP probably might be changed to IN operator, I assume it can use index.

CodePudding user response:

Remove com.prestataireLAD REGEXP '.*'. The Optimizer probably won't realize that this has no impact on the resultset. If you are dynamically building the WHERE clause, then eliminate anything else you can.

id_commande_2 is redundant. In queries where it might be useful, the UNIQUE can take care of it.

These indexes might help:

com:  INDEX(date_livraison, id_commande,  code_pe)
pe:  INDEX(pe_nom, pe_id)
  • Related