I am trying to come up with a query that will return the aggregate data for the earliest orders the customers have placed. Is it possible to do this when there are multiple rows/orders that were placed on the earliest purchase date for a given customer? For instance, I have these tables:
customers
id | name | created_at |
---|---|---|
1 | Sam | 2019-07-12 |
2 | Jimmy | 2019-01-22 |
items
id | name | price |
---|---|---|
1 | Watch | 200 |
2 | Belt | 75 |
3 | Wallet | 150 |
orders
id | customer_id | item_id | created_at |
---|---|---|---|
1 | 1 | 1 | 2018-08-01 |
2 | 1 | 2 | 2018-08-11 |
3 | 2 | 1 | 2019-01-22 |
4 | 2 | 3 | 2019-01-22 |
5 | 2 | 2 | 2019-03-03 |
expected query
customer_id | name | first_purchase_date | n_items | total_price |
---|---|---|---|---|
1 | Sam | 2018-08-01 | 1 | 200 |
2 | Jimmy | 2019-01-22 | 2 | 350 |
I currently have the following query set up, but this query is grouping by the customer_id such that the total number of items and total price do not reflect the earliest orders.
SELECT
orders.customer_id,
customers.name AS name,
MIN(orders.created_at) AS first_purchase_date,
COUNT(*) as n_items,
SUM(items.price) as total_price
FROM orders
INNER JOIN customers
ON orders.customer_id = customers.id
INNER JOIN items
ON orders.item_id = items.id
GROUP BY
customers.id
my incorrect query
customer_id | name | first_purchase_date | n_items | total_price |
---|---|---|---|---|
1 | Sam | 2018-08-01 | 2 | 275 |
2 | Jimmy | 2019-01-22 | 3 | 425 |
Would greatly appreciate any help. Thanks!
CodePudding user response:
Explanation:
cte
will give you the first_purchase_date for each customer id and name- using that table to find aggregations that matches the customer id and the first_purchase_date
See db<>fiddle
WITH cte AS (
SELECT
orders.customer_id,
customers.name,
MIN(orders.created_at) AS first_purchase_date
FROM orders
INNER JOIN customers
ON orders.customer_id = customers.id
INNER JOIN items
ON orders.item_id = items.id
GROUP BY
orders.customer_id,
customers.name
)
SELECT
cte.customer_id,
cte.name,
cte.first_purchase_date,
COUNT(orders.item_id) AS n_items,
SUM(items.price) AS total_price
FROM cte
INNER JOIN orders
ON cte.customer_id = orders.customer_id
AND cte.first_purchase_date = orders.created_at
INNER JOIN items
ON orders.item_id = items.id
GROUP BY
cte.customer_id,
cte.name,
cte.first_purchase_date
;
CodePudding user response:
select customer_id
,name
,created_at as first_purchase_date
,count(*) as n_items
,sum(price) as total_price
from (
select o.customer_id
,c.name
,o.created_at
,i.price
,rank() over(partition by o.customer_id order by o.created_at) as rn
from orders o join items i on i.id = o.item_id join customers c on c.id = o.customer_id
) t
where rn = 1
group by customer_id, name, created_at
customer_id | name | first_purchase_date | n_items | total_price |
---|---|---|---|---|
1 | Sam | 2018-08-01 00:00:00 | 1 | 200 |
2 | Jimmy | 2019-01-22 00:00:00 | 2 | 350 |