I am looking to write a query to pull the top selling product per category from a schema. The schema on a simplified view looks like this:
Category | Orderid | Revenue |
---|---|---|
Food | 12as | 234 |
Sport | 421bb | 3434 |
Steel | 35366cd | 12355 |
Food | 3421ww | 362 |
Sport | 546421qw | 436456 |
etc etc.
I am using amazon redshift. I want to find the distinct category, its top selling order ID and the sum of the revenue.
Select distinct category, orderid, sum(revenue) as rev from XXX
I've got the start of the query but not sure where to go from here.
CodePudding user response:
You can try to use ROW_NUMBER
window function, using PARTITION BY
with your grouping column then get the top-selling values.
SELECT Category,Orderid,Revenue
FROM (
SELECT *,
ROW_NUMBER() OVER(PARTITION BY Category ORDER BY Revenue desc) rn
FROM XXX
) t1
WHERE rn = 1
CodePudding user response:
See if the following works for you - using Window Function to find the OrderId with the highest revenue, then aggregating:
select Category, Sum(Revenue) TotalRevenue, Max(TopSelling) TopSelling
from (
select *,
First_Value(OrderId) over(partition by category order by Revenue desc) TopSelling
from t
)t
group by Category;