SQL How to take the minium for multiple fields?-CodePudding

Consider the following data set that records the product sold, year, and revenue from that particular product in thousands of dollars. This data table (YEARLY_PRODUCT_REVENUE) is stored in SQL and has many more rows.

Year | Product | Revenue 
2000   Table     100  
2000   Chair     200
2000   Bed       150
2010   Table     120  
2010   Chair     190
2010   Bed       390

Using SQL, for every year I would like to find the product that has the maximum revenue. That is, I would like my output to be the following:

Year | Product | Revenue  
2000   Chair     200
2010   Bed       390

My attempt so far has been this:

SELECT year, product, MIN(revenue)
FROM YEARLY_PRODUCT_REVENUE
GROUP BY article, month;

But when I do this, I get multiple-year values for distinct products. For instance, I'm getting the output below which is an error. I'm not entirely sure what the error here is. Any help would be much appreciated!

Year | Product | Revenue 
2000   Table     100  
2000   Bed       150
2010   Table     120  
2010   Chair     190

CodePudding user response：

You don't mention the database so I'll assume it's PostgreSQL. You can do:

select distinct on (year) * from t order by year, revenue desc

CodePudding user response：

You want filtering rather than aggregation. We can use window functions (which most databases support) to rank yearly product sales, and then retain only the top selling product per year.

select *
from (
    select r.*, rank() over(partition by year order by revenue desc) rn 
    from yearly_product_revenue r
) r
where rn = 1;

Here is a shorter solution if your database support the standard WITH TIES clause:

select *
from yearly_product_revenue r
order by rank() over(partition by year order by revenue desc) 
fetch first row with ties