I need to figure out what beers from table 'Beer' were not ordered by top 3 buyers from table 'Buyers'. Beer.BeerId is foreign key in Buyers.BeerId. Other important columns in 'Buyers' are: BuyId, PubId, StoreId and Quantity
dbo.Beer
BeerId |
---|
1 |
2 |
3 |
4 |
5 |
dbo.Buyers
BuyId | PubId | StoreId | BeerId | Quantity |
---|---|---|---|---|
1 | 1 | NULL | 1 | 30 |
2 | NULL | 1 | 2 | 40 |
3 | 2 | NULL | 3 | 50 |
4 | NULL | 2 | 4 | 10 |
I tried doing this query but it gives me no results.
SELECT Be.BeerId
FROM Beer be
left outer join Buyer bu
ON be.BeerId=bu.BeerId
WHERE not exists (
SELECT TOP(3) BuyId, bu.BeerId,
SUM(Quantity) as TotalOrdered
FROM Buyer bu
GROUP BY BuyId, bu.BeerId
ORDER BY SUM(Quantity) DESC)
What I would expect to see is that from the top 3 results, beers that are not ordered are BeerId=4 and BeerId=5
CodePudding user response:
This seems like a classic window function problem. You can use ROW_NUMBER()
to enumerate the rows in order of their Quantity
totals descending, and filter out the top 3 like so:
WITH _BuyersTotalsRanked AS
(
SELECT
BeerId, -- Same BeerId might've been ordered by different Buyers
ROW_NUMBER() OVER (ORDER BY TotalOrderedPerBeer DESC) AS BuyersRanked
FROM
(
SELECT
BeerId,
SUM(Quantity) AS TotalOrderedPerBeer
FROM dbo.Buyer
GROUP BY BuyId, BeerId
) AS BuyersTotals
)
SELECT DISTINCT BeerId -- Same BeerId might've been ordered by different Buyers
FROM _BuyersTotalsRanked
WHERE BuyersRanked > 3 -- Filter out the top 3 Buyers Beers
Note that if there's a tie for the top 3, this will return nondeterminsitic (semi-random) results because of the ORDER BY
clause in the ROW_NUMBER()
function not being unique (same problem with your ORDER BY
clause in your example query).
To correct that, you would need to add a uniquifying field to deterministically decide which row wins in a tie. Usually you would use the key of the table. I'm not sure what the key is for your dbo.Buyer
table. But whatever it is, or whichever combination of fields are unique, you can update the ROW_NUMBER()
clause like so ROW_NUMBER() OVER (ORDER BY TotalOrderedPerBeer DESC, UnqiueSetOfFields)
. Then those fields would be used to determine the winning row(s) in a tie.
Alternatively, if when there are ties for top 3, and you want to exclude all of those ties as well, you can use a different window function such as DENSE_RANK()
like so:
WITH _BuyersTotalsRanked AS
(
SELECT
BeerId,
DENSE_RANK() OVER (ORDER BY TotalOrderedPerBeer DESC) AS BuyersRanked
FROM
(
SELECT
BeerId,
SUM(Quantity) AS TotalOrderedPerBeer
FROM dbo.Buyer
GROUP BY BuyId, BeerId
) AS BuyersTotals
)
SELECT DISTINCT BeerId -- Same BeerId might've been ordered by different Buyers
FROM _BuyersTotalsRanked
WHERE BuyersRanked > 3 -- Filter out the top 3 Buyers Beers
DENSE_RANK()
will enumerate all tied rows with the same value, instead of randomly deciding which one comes first like ROW_NUMBER()
does.
Finally, your question (and data / schema) is a little unclear. Based on your example code, I'm interpreting that what you're asking for is to exclude the top 3 Buyer
s of any single Beer
, not top 3 by their told purchases of all Beer
s. E.g. Buyer1
may have only purchased one type of beer, BeerA
for a quantity of 50, but Buyer2
purchased two types of beer, BeerB
for 30, and BeerC
for 40. In the above code (and your attempted code) Buyer1
would be ranked #1 even though Buyer2
has a greater total purchase of Beer
s, totaling 70.
If you wanted to rank the Buyer
s by their total purchases across all Beer
s, then you'd want to do something like the following:
WITH _BuyersTotalsRanked AS
(
SELECT
BuyId,
DENSE_RANK() OVER (ORDER BY TotalOrderedAll DESC) AS BuyersRanked
FROM
(
SELECT
BuyId,
SUM(Quantity) AS TotalOrderedAll
FROM dbo.Buyer
GROUP BY BuyId
) AS BuyersTotals
)
SELECT DISTINCT B.BeerId -- Same BeerId might've been ordered by different Buyers
FROM dbo.Buyer AS B
INNER JOIN _BuyersTotalsRanked AS BTR
ON B.BuyId = BTR.BuyId
WHERE B.BuyersRanked > 3 -- Filter out the top 3 Buyers Beers
CodePudding user response:
select top (3)BeerId from ( SELECT BeerId, SUM(Quantity) Be.BeerId FROM Beer be left outer join Buyer bu ON be.BeerId=bu.BeerId Group by BeerId) my_derived_table ORDER BY BeerId