I would like to write a SQL query to retrieve all transactions (DOCUMENT_NUM) that contain at least two products from a selection list ("A", "B", "C")
select TRANSACTION_DAY, DOCUMENT_NUM, PRODUCT_CODE, PRODUCT_DSCR, QTY_SOLD, NET_SALES
from main.sales
where PRODUCT_CODE in ("A", "B", "C")
If I use the IN operator, the query may return transactions with one product from the list, but I want at least two.
Is it possible to achieve that?
CodePudding user response:
Yes, it is possible to achieve that using group by
and count
select distinct DOCUMENT_NUM
from main.sales
where PRODUCT_CODE in ("A", "B", "C")
group by DOCUMENT_NUM
having count(distinct PRODUCT_CODE)>=2
If you need all the columns, you can do a nested query
select TRANSACTION_DAY, DOCUMENT_NUM, PRODUCT_CODE, PRODUCT_DSCR, QTY_SOLD, NET_SALES
from main.sales
where DOCUMENT_NUM in (
select distinct DOCUMENT_NUM
from main.sales
where PRODUCT_CODE in ("A", "B", "C")
group by DOCUMENT_NUM
having count(distinct PRODUCT_CODE)>=2
)
and PRODUCT_CODE in ("A", "B", "C")
CodePudding user response:
Alternatively, consider approx_count_distinct
(faster than COUNT
DISTINCT
for larger datasets) to filter DOCUMENT_NUM
by distinct PRODUCT_CODE
:
select TRANSACTION_DAY, DOCUMENT_NUM,
PRODUCT_CODE, PRODUCT_DSCR,
QTY_SOLD, NET_SALES
from main.sales
where DOCUMENT_NUM IN (
select DOCUMENT_NUM
from main.sales
group by DOCUMENT_NUM
having approx_count_distinct(PRODUCT_CODE)
filter(where PRODUCT_CODE in ("A", "B", "C")) >= 2
)
CodePudding user response:
If you want to include transactions that have at least two of the same products as well, see code below.
CREATE TABLE test (
id INT
, cat CHAR)
INSERT INTO test(id, cat)
VALUES (1, 'a')
, (2, 'a')
, (3, 'a')
, (1, 'b')
, (2, 'a')
, (3, 'c')
SELECT id
, COUNT(id)
FROM test
WHERE cat IN ('a', 'b')
GROUP BY id
HAVING COUNT(id) > 1
If you only want transactions that have 2 or more unique products use below code to replace last block of code above:
SELECT id
, COUNT(id)
FROM (SELECT DISTINCT * FROM test) t
WHERE cat in ('a', 'b')
GROUP BY id
HAVING COUNT(id) > 1