Home > database >  For SQL, how to use group by twice but only order by sum() within the 2nd group
For SQL, how to use group by twice but only order by sum() within the 2nd group

Time:10-28

The table records credit card transactions where each row is one record.

The columns are: transaction_id, customerID, dollar_spent, product_category.

How can I pick up the 3 customerID's from each product_category that have the highest dollar_spent within that category?

I was thinking about something like:

select product_category, customerID, sum(dollar_spent) 
from transaction
group by product_category, customerID
order by sum(dollar_spent) desc limit 3

but it failed to pass. Removing "limit 3" helped it to pass but the whole result is sorted solely by sum(dollar_spent), not by sum(dollar_spent) within each product_category.

Searched on StackOverflow but didn't find anything relevant. Could someone help me with this? Many thanks!!

CodePudding user response:

What I think you are looking for is a windowed function.

I added the row_number function to your select ordered by the amount of money spent, this basically ranks customer spending by product.

Then since you can not use a windowed function in a where clause (unfortunately) i had to create an outer query to filter on the row_number results

select * from( 
   select product_category, customerID, sum(dollar_spent), 
   row_number() over (partition by product_category order by sum(dollar_spent) desc) as rowNum
   from transaction
   group by product_category, customerID
) as data
where rowNum <= 3
order by product_category, rowNum

here is a fiddle

  • Related