I would like to find the total purchase for each customer then return the highest value by customer category.
For now, I'm just able to have the total purchase for each customer
SELECT c.CustomerID,
c.CustomerName,
cat.CustomerCategoryName,
SUM(p.Quantity*p.UnitPrice) AS TotalAmount
FROM
Purchases AS p
join Customers AS c ON c.CustomerID = p.CustomerID
join Categories AS cat ON c.CustomerCategoryID = cat.CustomerCategoryID
GROUP BY c.CustomerID, c.CustomerName,cat.CustomerCategoryName
ORDER BY TotalAmount DESC
The result set return a row for each CustomerID
CustomerID | CustomerName | CustomerCategoryName | TotalAmount |
---|---|---|---|
905 | Sara Huiting | Supermarket | 24093.60 |
155 | Tailspin Toys | Novelty Shop | 23579.50 |
473 | Hilton | Hotel | 23125.60 |
143 | Jane Doe | Journalist | 21915.50 |
518 | Wingtip Toys | Novelty Shop | 20362.40 |
489 | Jason Black | Supermarket | 20226.40 |
... | ... | ... | ... |
I have 6 categories:
- Hotel
- Journalist
- Novelty Shop
- Supermarket
- Computer Store
- Gift Store
I would like the highest "TotalAmount" for each "CustomerCategoryName", so that only 6 records are returned (instead of 500).
CustomerID | CustomerName | CustomerCategoryName | TotalAmount |
---|---|---|---|
905 | Sara Huiting | Supermarket | 24093.60 |
155 | Tailspin Toys | Novelty Shop | 23579.50 |
473 | Hilton | Hotel | 23125.60 |
143 | Jane Doe | Journalist | 21915.50 |
1018 | Nils Kaulins | Computer Store | 17019.00 |
866 | Jay Bhuiyan | Gift Store | 14251.50 |
How to improve my query to get this output?
CodePudding user response:
You can use TOP 1 WITH TIES
in combination with an ORDER BY
clause on a ROW_NUMBER
window function, that will assign ranking = 1 to all the highest "TotalAmount" values for each "CustomerCategoryName".
SELECT TOP 1 WITH TIES
c.CustomerID,
c.CustomerName,
cat.CustomerCategoryName,
SUM(p.Quantity*p.UnitPrice) AS TotalAmount
FROM Purchases p
JOIN Customers c ON c.CustomerID = p.CustomerID
JOIN Categories cat ON c.CustomerCategoryID = cat.CustomerCategoryID
GROUP BY c.CustomerID,
c.CustomerName,
cat.CustomerCategoryName
ORDER BY ROW_NUMBER() OVER(PARTITION BY cat.CustomerCategoryName
ORDER BY SUM(p.Quantity*p.UnitPrice) DESC)
CodePudding user response:
If you want to do this with just subqueries, and not with a CTE, you can do the following process:
- Innermost query - Get all row values
- Second query - Assign a row number for each row, partitioned by the CustomerCategoryName and ordered by TotalAmount
- Final query only has where the RowRank is 1
You can probably optimize the innermost subquery by putting the RowRank in it, but without access to the table, I'm not entirely sure if the query plan will be any more efficient.
/*
Get final values where the RowRank = 1
*/
SELECT DISTINCT
final.CustomerID,
final.CustomerName,
final.CustomerCategoryName,
final.TotalAmount
FROM (
/*
Get the individual row rankings by TotalAmount DESC
*/
SELECT DISTINCT
data.CustomerID,
data.CustomerName,
data.CustomerCategoryName,
data.TotalAmount,
ROW_NUMBER() OVER (PARTITION BY data.CustomerCategoryName ORDER BY data.TotalAmount DESC) AS RowRank
FROM (
/*
Get all row values
*/
SELECT
c.CustomerID,
c.CustomerName,
cat.CustomerCategoryName,
SUM(p.Quantity * p.UnitPrice) AS TotalAmount
FROM Purchases AS p
JOIN Customers AS c
ON c.CustomerID = p.CustomerID
JOIN Categories AS cat
ON c.CustomerCategoryID = cat.CustomerCategoryID
GROUP BY c.CustomerID,
c.CustomerName,
cat.CustomerCategoryName) data) final
WHERE final.RowRank = 1
ORDER BY final.TotalAmount DESC