I'm working on a large e-commerce website. I have a SQL query that retrieves all the products that a customer has bought in the last 6 months, but the query is taking a long time to run.
SELECT products.*
FROM products
INNER JOIN orders ON products.product_id = orders.product_id
INNER JOIN customers ON orders.customer_id = customers.customer_id
WHERE customers.customer_id = [customer_id]
AND orders.order_date BETWEEN [start_date] AND [end_date];
The table structure for relevant tables is:
- products table has 1 million records and has the following columns: product_id, product_name, product_price, product_category.
- orders table has 10 million records and has the following columns: order_id, product_id, customer_id, order_date.
- customers table has 100,000 records and has the following columns: customer_id, customer_name, customer_email.
I have created the following indexes:
- products table has an index on product_id column.
- orders table has an index on product_id, customer_id, and order_date columns.
- customers table has an index on customer_id column.
- I expect the query to return all the products that a customer has bought in the last 6 months, but it's taking a long time to run. Can anyone help me optimize this query?
CodePudding user response:
The obvious optimization that you can do is to not join customers
because you have customer_id
in orders
:
SELECT p.*
FROM products p INNER JOIN orders o
ON p.product_id = o.product_id
WHERE o.customer_id = ? AND o.order_date >= CURRENT_DATE - INTERVAL 6 MONTH;
For this query you need a composite index:
CREATE INDEX idx_orders ON orders(customer_id, order_date);
CodePudding user response:
You could leave out the customer table.
SELECT p.product_id
, p.product_name
, p.product_price
, p.product_category
FROM product p
INNER JOIN orders o ON p.product_id = o.product_id
AND o.customer_id = [customer_id]
AND o.order_date BETWEEN [start_date] AND [end_date]