Home > Software engineering >  Find purchase if same item on different days
Find purchase if same item on different days

Time:10-29

I'm trying to find customers that bought the same item more than once in different days. I got it partially working. I can't get the customer first/last name and item_name without adding it to the group by clause. In addition, I want to include a count if how many times the same uten was purchased on different days.

I suspect that group by is probably not the best solution. Would this be better solved using a self JOIN or perhaps a lead?

CREATE TABLE customers 
(CUSTOMER_ID, FIRST_NAME, LAST_NAME) AS
SELECT 1, 'Abby', 'Katz' FROM DUAL UNION ALL
SELECT 2, 'Lisa', 'Saladino' FROM DUAL UNION ALL
SELECT 3, 'Jerry', 'Torchiano' FROM DUAL;

CREATE TABLE items 
(PRODUCT_ID, PRODUCT_NAME) AS
SELECT 100, 'Black Shoes' FROM DUAL UNION ALL
SELECT 101, 'Brown Shoes' FROM DUAL UNION ALL
SELECT 102, 'White Shoes' FROM DUAL;

CREATE TABLE purchases
(CUSTOMER_ID, PRODUCT_ID, QUANTITY, PURCHASE_DATE) AS
SELECT 1, 100, 1, TIMESTAMP'2022-10-11 09:54:48' FROM DUAL UNION ALL
SELECT 1, 100, 1, TIMESTAMP '2022-10-11 19:04:18' FROM DUAL UNION ALL
SELECT 2, 101,1, TIMESTAMP '2022-10-11 09:54:48' FROM DUAL UNION ALL
SELECT 2,101,1, TIMESTAMP '2022-10-17 19:04:18' FROM DUAL UNION ALL
SELECT 3, 101,1, TIMESTAMP '2022-10-11 09:54:48' FROM DUAL UNION ALL
SELECT 3,102,1, TIMESTAMP '2022-10-17 19:04:18' FROM DUAL;

With CTE as (
 SELECT customer_id
                ,product_id 
                ,trunc(purchase_date)
  FROM purchases
  GROUP BY customer_id 
                    ,product_id
                   ,trunc(purchase_date)
)
SELECT  customer_id, product_id
FROM CTE
GROUP BY customer_id                  ,product_id 
HAVING COUNT(1)>1

CodePudding user response:

I would use exists logic here:

SELECT DISTINCT c.first_name, c.last_name
FROM customers c
INNER JOIN purchases p
    ON p.customer_id = c.customer_id
WHERE EXISTS (
    SELECT 1
    FROM purchases p2
    WHERE p2.customer_id = p.customer_id AND
          p2.product_id = p.product_id   AND
          TRUNC(p2.purchase_date) <> TRUNC(p.purchase_date)
);

In plain English, the above query says to find all customers who bought the same product but on different dates.

CodePudding user response:

This might be one option: use count function in its analytic form and the fetch rows where that count is larger than 1; according to data you posted, it is Lisa who bought brown shoes on two different dates.

SQL> WITH
  2     temp
  3     AS
  4        (  SELECT c.first_name,
  5                  i.product_name,
  6                  TRUNC (p.purchase_date),
  7                  COUNT (*) OVER (PARTITION BY c.first_name, i.product_name) cnt
  8             FROM purchases p
  9                  JOIN customers c ON c.customer_id = p.customer_id
 10                  JOIN items i ON i.product_id = p.product_id
 11         GROUP BY c.first_name, i.product_name, TRUNC (p.purchase_date))
 12  SELECT DISTINCT first_name, product_name, cnt
 13    FROM temp
 14   WHERE cnt > 1;

FIRST PRODUCT_NAM        CNT
----- ----------- ----------
Lisa  Brown Shoes          2

SQL>

CodePudding user response:

You do not need neither "self join" or "not exists". And even analytic count function. All you need is only one group by.

with customers (CUSTOMER_ID, FIRST_NAME, LAST_NAME) AS (
SELECT 1, 'Abby', 'Katz' FROM DUAL UNION ALL
SELECT 2, 'Lisa', 'Saladino' FROM DUAL UNION ALL
SELECT 3, 'Jerry', 'Torchiano' FROM DUAL),
items (product_id, product_name) as (
SELECT 100, 'Black Shoes' FROM DUAL UNION ALL
SELECT 101, 'Brown Shoes' FROM DUAL UNION ALL
SELECT 102, 'White Shoes' FROM DUAL),
purchases (CUSTOMER_ID, PRODUCT_ID, QUANTITY, PURCHASE_DATE) AS (
SELECT 1, 100, 1, TIMESTAMP'2022-10-11 09:54:48' FROM DUAL UNION ALL
SELECT 1, 100, 1, TIMESTAMP '2022-10-11 19:04:18' FROM DUAL UNION ALL
SELECT 2, 101,1, TIMESTAMP '2022-10-11 09:54:48' FROM DUAL UNION ALL
SELECT 2,101,1, TIMESTAMP '2022-10-17 19:04:18' FROM DUAL UNION ALL
SELECT 3, 101,1, TIMESTAMP '2022-10-11 09:54:48' FROM DUAL UNION ALL
SELECT 3,102,1, TIMESTAMP '2022-10-17 19:04:18' FROM DUAL)
-- SQL Query
select p.product_id, p.customer_id, c.first_name, c.last_name, i.product_name, 
count(*) cnt_purchases, count(distinct trunc(purchase_date)) cnt_dist_days
from      purchases p
left join customers c on p.customer_id = c.customer_id
left join items     i on p.product_id  = i.product_id
group by p.product_id, p.customer_id, c.first_name, c.last_name, i.product_name
having count(distinct trunc(purchase_date)) > 1;

PRODUCT_ID CUSTOMER_ID FIRST_NAME LAST_NAME PRODUCT_NAME CNT_PURCHASES CNT_DIST_DAYS
---------- ----------- ---------- --------- ------------ ------------- -------------
       101           2 Lisa       Saladino  Brown Shoes              2             2

https://dbfiddle.uk/nqGdtxAY

  • Related