For example, I have a Product table with a Value column, but I would like to display only the rows of the top 3 products based on the sum of their values.
Product | Date | Value |
---|---|---|
Product 1 | 2022-12-01 | 200 |
Product 1 | 2022-12-02 | 200 |
Product 2 | 2022-12-01 | 200 |
Product 2 | 2022-12-03 | 500 |
Product 3 | 2022-12-04 | 300 |
Product 3 | 2022-12-08 | 600 |
Product 4 | 2022-12-01 | 100 |
Product 4 | 2022-12-03 | 100 |
Product 5 | 2022-12-01 | 700 |
Product 5 | 2022-12-10 | 800 |
Based on the sample above, the sum for each product would be:
- Product 1: 400
- Product 2: 700
- Product 3: 900
- Product 4: 200
- Product 5: 1,500
And I would like to display only the values of the top 3 products (Products 5, 3, and 2).
Product | Date | Value |
---|---|---|
Product 2 | 2022-12-01 | 200 |
Product 2 | 2022-12-03 | 500 |
Product 3 | 2022-12-04 | 300 |
Product 3 | 2022-12-08 | 600 |
Product 5 | 2022-12-01 | 700 |
Product 5 | 2022-12-10 | 800 |
Until now, I first ran a query to find the products with the highest sums and then used its result as a filter on my table. I'd like to do this with a single SQL query instead of running two separate queries.
SELECT product, count(value) as prod_count
FROM product
GROUP BY product
ORDER BY prod_count
LIMIT 3
CodePudding user response:
We can SUM (instead of COUNT) the value and GROUP BY the product.
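Here we can use FETCH FIRST 3 ROWS WITH TIES so that products sharing the third-highest sum are all returned, for example two products with an identical third-highest sum. This syntax is not available in every database; PostgreSQL, for example, supports it from version 13.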
So the entire query will be this one:
SELECT product, date, value
FROM product
WHERE product IN
(SELECT product
FROM product
GROUP BY product
ORDER BY SUM(value) DESC
FETCH FIRST 3 ROWS WITH TIES)
ORDER BY product, date;
We should mention that the column and table naming should be improved if possible, because using the same name "product" for both the table and a column hurts readability.
Furthermore, the column "date" (which is actually a SQL keyword) would be better renamed to something more meaningful, for example "sellDate". The same goes for the column "value".
Anyway, let's assume there is another product, "Product 6", which has the same summed value (700) as Product 2.
Then the above query will produce this outcome:
Product | Date | Value |
---|---|---|
Product 2 | 2022-12-01 | 200 |
Product 2 | 2022-12-03 | 500 |
Product 3 | 2022-12-04 | 300 |
Product 3 | 2022-12-08 | 600 |
Product 5 | 2022-12-01 | 700 |
Product 5 | 2022-12-10 | 800 |
Product 6 | 2022-12-01 | 600 |
Product 6 | 2022-12-10 | 100 |
If showing four products (or more, if more of them share the third-highest sum of value) is not intended, we can just use LIMIT 3 instead:
SELECT product, date, value
FROM product
WHERE product IN
(SELECT product
FROM product
GROUP BY product
ORDER BY SUM(value) DESC
LIMIT 3)
ORDER BY product, date;
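So we will get only three products again, and one of the two products with the summed value 700 will not be selected. Without a tie-breaker in the ORDER BY clause, which of the two is dropped is not guaranteed; here we assume it is Product 2.
In that case, the result of this query would be this: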
Product | Date | Value |
---|---|---|
Product 3 | 2022-12-04 | 300 |
Product 3 | 2022-12-08 | 600 |
Product 5 | 2022-12-01 | 700 |
Product 5 | 2022-12-10 | 800 |
Product 6 | 2022-12-01 | 600 |
Product 6 | 2022-12-10 | 100 |
Or, if we want Product 2 to be chosen rather than Product 6, we can add the product to the ORDER BY clause as a tie-breaker:
SELECT product, date, value
FROM product
WHERE product IN
(SELECT product
FROM product
GROUP BY product
ORDER BY SUM(value) DESC, product
LIMIT 3)
ORDER BY product, date;
This will be the outcome for this query:
Product | Date | Value |
---|---|---|
Product 2 | 2022-12-01 | 200 |
Product 2 | 2022-12-03 | 500 |
Product 3 | 2022-12-04 | 300 |
Product 3 | 2022-12-08 | 600 |
Product 5 | 2022-12-01 | 700 |
Product 5 | 2022-12-10 | 800 |
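To check the tie-breaker variant locally, here is a minimal sketch that runs the same query against an in-memory SQLite database through Python's sqlite3 module. Using SQLite is an assumption made only for this illustration (the question does not name a database), and the rows are the sample data plus the hypothetical "Product 6" from above.

```python
import sqlite3

# Sample rows from the question plus the hypothetical tied "Product 6".
ROWS = [
    ("Product 1", "2022-12-01", 200), ("Product 1", "2022-12-02", 200),
    ("Product 2", "2022-12-01", 200), ("Product 2", "2022-12-03", 500),
    ("Product 3", "2022-12-04", 300), ("Product 3", "2022-12-08", 600),
    ("Product 4", "2022-12-01", 100), ("Product 4", "2022-12-03", 100),
    ("Product 5", "2022-12-01", 700), ("Product 5", "2022-12-10", 800),
    ("Product 6", "2022-12-01", 600), ("Product 6", "2022-12-10", 100),
]

# Same query as above: the subquery picks the three products with the
# highest sums, using the product name to break ties deterministically.
QUERY = """
SELECT product, date, value
FROM product
WHERE product IN
  (SELECT product
   FROM product
   GROUP BY product
   ORDER BY SUM(value) DESC, product
   LIMIT 3)
ORDER BY product, date;
"""


def top_three_rows():
    """Load the sample rows into an in-memory database and run the query."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE product (product TEXT, date TEXT, value INTEGER)")
    conn.executemany("INSERT INTO product VALUES (?, ?, ?)", ROWS)
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows


result = top_three_rows()
for row in result:
    print(row)
```

Because Product 2 sorts before Product 6, the tie at 700 is resolved in favour of Product 2, matching the table above.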
We can try it out here: db<>fiddle
CodePudding user response:
You could use the DENSE_RANK function in an aggregated subquery joined to your table, as follows:
Select P.Product, P.Date, P.Value
From Product P Join
(
Select Product,
DENSE_RANK() Over (Order By Sum(Value) Desc) rn
From Product
Group By Product
) T
On P.Product = T.Product
Where T.rn <= 3
Order By P.Product, P.Date
See a demo.
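As a rough way to see how the ranking approach treats ties, the sketch below runs it in an in-memory SQLite database via Python's sqlite3 module. Several things here are assumptions for illustration only: the use of SQLite (window functions need version 3.25 or later), the extra tied "Product 6" borrowed from the other answer, and the split into two CTEs (totals, ranked), which restructures the query above so that the aggregation and the ranking are separate steps.

```python
import sqlite3

# Sample rows from the question plus a hypothetical "Product 6" whose
# sum (700) ties with Product 2.
ROWS = [
    ("Product 1", "2022-12-01", 200), ("Product 1", "2022-12-02", 200),
    ("Product 2", "2022-12-01", 200), ("Product 2", "2022-12-03", 500),
    ("Product 3", "2022-12-04", 300), ("Product 3", "2022-12-08", 600),
    ("Product 4", "2022-12-01", 100), ("Product 4", "2022-12-03", 100),
    ("Product 5", "2022-12-01", 700), ("Product 5", "2022-12-10", 800),
    ("Product 6", "2022-12-01", 600), ("Product 6", "2022-12-10", 100),
]

# Step 1 (totals): sum the values per product.
# Step 2 (ranked): rank the sums; equal sums share a rank.
# Step 3: keep the rows of every product ranked 3 or better.
QUERY = """
WITH totals AS (
  SELECT product, SUM(value) AS total
  FROM product
  GROUP BY product
),
ranked AS (
  SELECT product, DENSE_RANK() OVER (ORDER BY total DESC) AS rn
  FROM totals
)
SELECT p.product, p.date, p.value
FROM product AS p
JOIN ranked AS r ON p.product = r.product
WHERE r.rn <= 3
ORDER BY p.product, p.date;
"""


def ranked_rows():
    """Load the sample rows into an in-memory database and run the query."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE product (product TEXT, date TEXT, value INTEGER)")
    conn.executemany("INSERT INTO product VALUES (?, ?, ?)", ROWS)
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows


result = ranked_rows()
products = sorted({row[0] for row in result})
print(products)
```

With the tied data, Products 2 and 6 both receive rank 3, so four products are returned. Note that DENSE_RANK counts distinct sums rather than products: if two products tied for first place, the filter rn <= 3 would still admit the next two distinct sums. Using RANK instead would cap the result at three products except for ties at the last place.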