I have two tables. search has 89000 records and email_leads has 26000.
I would like to correlate which email and name has performed the search by left joining on the base_id column from the search table.
However when I include the LEFT JOIN in the query it takes upwards of 7 minutes plus. When i remove it, the query executes instantly. Is there any way to restructure the query so that it doesn't take that long. The explain output makes me think mysql is actually joining all the rows it is scanning from the search table with the email_leads table instead of just the 10 records i am looking to return from the search table.
Here is the query i am running:
SELECT
sh.base_id,
sh.client_id,
sh.app_id,
sh.result_cnt,
sh.search_type,
sh.min_price,
sh.max_price,
sh.bedrooms,
sh.bathrooms,
sh.neighborhoods,
sh.office_connector,
sh.created,
any_value(el.from_email) as email,
any_value(el.from_name) as name
FROM search AS sh
LEFT JOIN email_leads AS el ON(sh.base_id = el.base_id)
WHERE date(sh.created) >= '2022-04-27'
AND date(sh.created) <= '2022-05-27'
GROUP BY sh.app_id, sh.base_id, sh.client_id, sh.result_cnt, sh.search_type, sh.min_price, sh.max_price,
sh.bedrooms, sh.bathrooms, sh.neighborhoods, sh.office_connector, sh.created ORDER BY sh.created DESC LIMIT 0, 10
Here are the indexes for both tables:
mysql> show index from search;
-------- ------------ ------------------------ -------------- ------------------ ----------- ------------- ---------- -------- ------ ------------ --------- --------------- --------- -------------------------
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment | Visible | Expression |
-------- ------------ ------------------------ -------------- ------------------ ----------- ------------- ---------- -------- ------ ------------ --------- --------------- --------- -------------------------
| search | 0 | PRIMARY | 1 | id | A | 87659 | NULL | NULL | | BTREE | | | YES | NULL |
| search | 1 | app_id_index | 1 | app_id | A | 5 | NULL | NULL | | BTREE | | | YES | NULL |
| search | 1 | search_type_index | 1 | search_type | A | 3 | NULL | NULL | YES | BTREE | | | YES | NULL |
| search | 1 | office_connector_index | 1 | office_connector | A | 6 | NULL | NULL | YES | BTREE | | | YES | NULL |
| search | 1 | bedrooms_index | 1 | bedrooms | A | 53 | NULL | NULL | YES | BTREE | | | YES | NULL |
| search | 1 | bathrooms_index | 1 | bathrooms | A | 5 | NULL | NULL | YES | BTREE | | | YES | NULL |
| search | 1 | min_price_index | 1 | min_price | A | 33 | NULL | NULL | YES | BTREE | | | YES | NULL |
| search | 1 | max_price_index | 1 | max_price | A | 51 | NULL | NULL | YES | BTREE | | | YES | NULL |
| search | 1 | base_id_index | 1 | base_id | A | 5474 | NULL | NULL | | BTREE | | | YES | NULL |
| search | 1 | client_id_index | 1 | client_id | A | 18 | NULL | NULL | | BTREE | | | YES | NULL |
| search | 1 | created_to_date | 1 | NULL | A | 68 | NULL | NULL | YES | BTREE | | | YES | cast(`created` as date) |
-------- ------------ ------------------------ -------------- ------------------ ----------- ------------- ---------- -------- ------ ------------ --------- --------------- --------- -------------------------
11 rows in set (0.02 sec)
mysql> show index from email_leads;
------------- ------------ ------------------ -------------- ------------- ----------- ------------- ---------- -------- ------ ------------ --------- --------------- --------- ------------
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment | Visible | Expression |
------------- ------------ ------------------ -------------- ------------- ----------- ------------- ---------- -------- ------ ------------ --------- --------------- --------- ------------
| email_leads | 0 | PRIMARY | 1 | id | A | 21340 | NULL | NULL | | BTREE | | | YES | NULL |
| email_leads | 1 | from_email_index | 1 | from_email | A | 5211 | NULL | NULL | | BTREE | | | YES | NULL |
| email_leads | 1 | base_id_index | 1 | base_id | A | 1482 | NULL | NULL | | BTREE | | | YES | NULL |
| email_leads | 1 | client_id_index | 1 | client_id | A | 3 | NULL | NULL | | BTREE | | | YES | NULL |
------------- ------------ ------------------ -------------- ------------- ----------- ------------- ---------- -------- ------ ------------ --------- --------------- --------- ------------
4 rows in set (0.00 sec)
The explain output for the query:
---- ------------- ------- ------------ ------ ----------------- --------------- --------- ---------------------- ------- ---------- ----------------------------------------------
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
---- ------------- ------- ------------ ------ ----------------- --------------- --------- ---------------------- ------- ---------- ----------------------------------------------
| 1 | SIMPLE | sh | NULL | ALL | created_to_date | NULL | NULL | NULL | 87659 | 50.00 | Using where; Using temporary; Using filesort |
| 1 | SIMPLE | el | NULL | ref | base_id_index | base_id_index | 98 | srmanager.sh.base_id | 14 | 100.00 | NULL |
---- ------------- ------- ------------ ------ ----------------- --------------- --------- ---------------------- ------- ---------- ----------------------------------------------
CodePudding user response:
You can give sub query a try instead of left join.
SELECT
sh.base_id,
sh.client_id,
sh.app_id,
sh.result_cnt,
sh.search_type,
sh.min_price,
sh.max_price,
sh.bedrooms,
sh.bathrooms,
sh.neighborhoods,
sh.office_connector,
sh.created,
(select el.from_email from email_leads AS el where sh.base_id = el.base_id limit 1) as email,
(select el.from_name from email_leads AS el where sh.base_id = el.base_id limit 1) as name
FROM search AS sh
WHERE date(sh.created) >= '2022-04-27'
AND date(sh.created) <= '2022-05-27'
GROUP BY sh.app_id, sh.base_id, sh.client_id, sh.result_cnt, sh.search_type, sh.min_price, sh.max_price,
sh.bedrooms, sh.bathrooms, sh.neighborhoods, sh.office_connector, sh.created ORDER BY sh.created DESC LIMIT 0, 10
CodePudding user response:
You use of DATE(sh.created)
prevents using the index on that column when you put the column into a function. So it must do a table-scan of 89k rows. This is shown by type: ALL
and rows: 87659
in the EXPLAIN.
Instead, try this:
...
WHERE sh.created >= '2022-04-27'
sh.created < '2022-05-28'
...
Notice I made the latter condition strictly less-than (<
) and advanced the date by one day. This is to allow the dates to match even if the datetime is up to 23:59:59.