Home > Software design >  MySQL Inner join and subqueries takes a long time
MySQL Inner join and subqueries takes a long time

Time:11-15

I am trying to get a result from the following query but the calculation is taking a very long time

How can I improve it ?

Please help

SELECT 
    ic.id, ic.id_number, ia.phone_number 
FROM 
    in_client AS ic 
INNER JOIN (
    SELECT id_number, MAX(modified) AS modified 
    FROM in_client 
    WHERE 
        id_number IS NOT NULL AND 
        agent_id = 1234
    GROUP BY id_number 
    HAVING MAX(modified) >= '2021-mm-dd hh:mm:ss'
) AS max USING (id_number, modified) 
INNER JOIN 
    in_agent AS ia ON (ia.id = ic.agent_id) 
LEFT OUTER JOIN 
    in_policy AS ip ON (ip.client_id = ic.id) 
WHERE 
    ip.client_id IS NULL

SCHEMAS:

in_client

id - PK (INT)
agent_id - FK (INT)
id_number - VARCHAR
modified - DATETIME

in_agent

id - PK (INT)
name - VARCHAR

in_policy

id - PK (INT)
client_id - FK (INT)

Note: there are more other fields for each table but they are not relevant for this query

As general IDEA

This query takes all clients from a client table under the same agent who do not have files in the policies table.

Each client can be created multiple times so I used GROUP BY For each group of clients and then I take the most updated client (Modified).

Each client has associated with agent (FK) - so there is an INNER JOIN for perfect match within the sub-querie. And in order to get all clients who do not have files in the policy table, I use LEFT OUTER JOIN

Query Explain:

Query Explain

CodePudding user response:

In in_client, try adding the following indexes:

  INDEX idx1_client (id_number, modified)
, INDEX idx2_client (agent_id, id_number, modified)

These indexes were chosen to satisfy requirements of the query expression inside the max derived table, to allow better access to (agent_id, id_number and modified) and for the join to in_client USING (id_number, modified). This was just a first guess without the full schema for these tables.

I used both MariaDB and MySQL for testing, and newer versions to allow creation of the test data dynamically recursively.

The simple test case created to evaluate the problem:

The fiddle (runs with MariaDB 10.3 and MySQL 8 )

CREATE TABLE in_agent (
   id           INT primary key auto_increment
 , name         VARCHAR(30)
 , phone_number varchar(20)
);

CREATE TABLE in_client (
   id        INT primary key auto_increment
 , agent_id  INT  references in_agent (id)
 , id_number VARCHAR(30)
 , modified  DATETIME DEFAULT current_timestamp
 , INDEX idx1 (id_number, modified)
 , INDEX idx2 (agent_id, id_number, modified)
);

CREATE TABLE in_policy (
   id        INT primary key auto_increment
 , client_id INT references in_client (id)
);

INSERT INTO in_agent (name)
WITH RECURSIVE cte1 (lev) AS (
      SELECT 1 UNION ALL
      SELECT lev   1 FROM cte1 WHERE lev < 100
     )
SELECT CONCAT('name', lev) FROM cte1
;

INSERT INTO in_client (agent_id, id_number)
WITH RECURSIVE cte1 (lev) AS (
      SELECT 1 UNION ALL
      SELECT lev   1 FROM cte1 WHERE lev < 100
     )
SELECT TRUNCATE((lev-1)/50,0) 1, CONCAT('idnum', lev) FROM cte1
;

SELECT COUNT(*) FROM in_agent;
SELECT COUNT(*) FROM in_client;

SELECT ic.id, ic.id_number, ia.phone_number 
  FROM in_client AS ic 
  JOIN (
    SELECT id_number, MAX(modified) AS modified 
      FROM in_client 
     WHERE id_number IS NOT NULL
       AND agent_id = 1
     GROUP BY id_number 
    HAVING MAX(modified) >= '2021-01-01 01:00:00'
     ) AS max USING (id_number, modified) 
  JOIN      in_agent  AS ia ON (ia.id = ic.agent_id) 
  LEFT JOIN in_policy AS ip ON (ip.client_id = ic.id) 
 WHERE ip.client_id IS NULL
;

EXPLAIN
SELECT ic.id, ic.id_number, ia.phone_number 
  FROM in_client AS ic 
  JOIN (
    SELECT id_number, MAX(modified) AS modified 
      FROM in_client 
     WHERE id_number IS NOT NULL
       AND agent_id = 1
     GROUP BY id_number 
    HAVING MAX(modified) >= '2021-01-01 01:00:00'
     ) AS max USING (id_number, modified) 
  JOIN      in_agent  AS ia ON (ia.id = ic.agent_id) 
  LEFT JOIN in_policy AS ip ON (ip.client_id = ic.id) 
 WHERE ip.client_id IS NULL
;

The indexes chosen seem to be better than those in the original question detail.

This should help.

  • Related