I am trying to get a result from the following query but the calculation is taking a very long time
How can I improve it ?
Please help
SELECT
ic.id, ic.id_number, ia.phone_number
FROM
in_client AS ic
INNER JOIN (
SELECT id_number, MAX(modified) AS modified
FROM in_client
WHERE
id_number IS NOT NULL AND
agent_id = 1234
GROUP BY id_number
HAVING MAX(modified) >= '2021-mm-dd hh:mm:ss'
) AS max USING (id_number, modified)
INNER JOIN
in_agent AS ia ON (ia.id = ic.agent_id)
LEFT OUTER JOIN
in_policy AS ip ON (ip.client_id = ic.id)
WHERE
ip.client_id IS NULL
SCHEMAS:
in_client
id - PK (INT)
agent_id - FK (INT)
id_number - VARCHAR
modified - DATETIME
in_agent
id - PK (INT)
name - VARCHAR
in_policy
id - PK (INT)
client_id - FK (INT)
Note: there are more other fields for each table but they are not relevant for this query
As general IDEA
This query takes all clients from a client table under the same agent who do not have files in the policies table.
Each client can be created multiple times so I used GROUP BY For each group of clients and then I take the most updated client (Modified).
Each client has associated with agent (FK) - so there is an INNER JOIN for perfect match within the sub-querie. And in order to get all clients who do not have files in the policy table, I use LEFT OUTER JOIN
Query Explain:
CodePudding user response:
In in_client
, try adding the following indexes:
INDEX idx1_client (id_number, modified)
, INDEX idx2_client (agent_id, id_number, modified)
These indexes were chosen to satisfy requirements of the query expression inside the max
derived table, to allow better access to (agent_id, id_number and modified)
and for the join to in_client USING (id_number, modified)
. This was just a first guess without the full schema for these tables.
I used both MariaDB and MySQL for testing, and newer versions to allow creation of the test data dynamically recursively.
The simple test case created to evaluate the problem:
The fiddle (runs with MariaDB 10.3 and MySQL 8 )
CREATE TABLE in_agent (
id INT primary key auto_increment
, name VARCHAR(30)
, phone_number varchar(20)
);
CREATE TABLE in_client (
id INT primary key auto_increment
, agent_id INT references in_agent (id)
, id_number VARCHAR(30)
, modified DATETIME DEFAULT current_timestamp
, INDEX idx1 (id_number, modified)
, INDEX idx2 (agent_id, id_number, modified)
);
CREATE TABLE in_policy (
id INT primary key auto_increment
, client_id INT references in_client (id)
);
INSERT INTO in_agent (name)
WITH RECURSIVE cte1 (lev) AS (
SELECT 1 UNION ALL
SELECT lev 1 FROM cte1 WHERE lev < 100
)
SELECT CONCAT('name', lev) FROM cte1
;
INSERT INTO in_client (agent_id, id_number)
WITH RECURSIVE cte1 (lev) AS (
SELECT 1 UNION ALL
SELECT lev 1 FROM cte1 WHERE lev < 100
)
SELECT TRUNCATE((lev-1)/50,0) 1, CONCAT('idnum', lev) FROM cte1
;
SELECT COUNT(*) FROM in_agent;
SELECT COUNT(*) FROM in_client;
SELECT ic.id, ic.id_number, ia.phone_number
FROM in_client AS ic
JOIN (
SELECT id_number, MAX(modified) AS modified
FROM in_client
WHERE id_number IS NOT NULL
AND agent_id = 1
GROUP BY id_number
HAVING MAX(modified) >= '2021-01-01 01:00:00'
) AS max USING (id_number, modified)
JOIN in_agent AS ia ON (ia.id = ic.agent_id)
LEFT JOIN in_policy AS ip ON (ip.client_id = ic.id)
WHERE ip.client_id IS NULL
;
EXPLAIN
SELECT ic.id, ic.id_number, ia.phone_number
FROM in_client AS ic
JOIN (
SELECT id_number, MAX(modified) AS modified
FROM in_client
WHERE id_number IS NOT NULL
AND agent_id = 1
GROUP BY id_number
HAVING MAX(modified) >= '2021-01-01 01:00:00'
) AS max USING (id_number, modified)
JOIN in_agent AS ia ON (ia.id = ic.agent_id)
LEFT JOIN in_policy AS ip ON (ip.client_id = ic.id)
WHERE ip.client_id IS NULL
;
The indexes chosen seem to be better than those in the original question detail.
This should help.