The following image is the ER diagram of the database:
My task is to create a report that includes the following:
- the storeID
- the store name
- the number of unique players that have purchased a badge from the store
- the number of unique players that have not purchased a badge from the store
- the total money spent at the store
- the most expensive badge a player has purchased at the store
- the cheapest badge a player has purchased at the store
- the average price of the items that have been purchased at the store.
But when I am trying to execute the following SQL command, I am getting an error saying: Error Code 1111. Invalid use of group function.
use treasurehunters;
select rpt_category.storeId,
rpt_category.storeName,
total_purchased_user,
non_purchased_player,
total_spent,
expensive_badge,
cheapest_badge
avg_spent
from
(select badgename as expensive_badge
from badge
inner join purchase
where cost = max(cost)) rpt_data_2
inner join
(select badgename as cheapest_badge
from badge
inner join purchase
where cost = min(cost)) rpt_data_3
inner join
(select distinct count(username) as total_purchased_user,
storeid,
storename,
sum(cost) as total_spent,
average(cost) as avg_spent
from player
inner join purchase
inner join store
inner join badge
on store.storeID = purchase.storeID and
purchase.username= player.username and
purchase.badgeID = badge.badgeId) rpt_category
inner join
(select count (username) as non_purchased_player,
storeid
from player
inner join purchase
on purchase.storeid != store.storeid and
player.userername= purchase.uername ) rpt_data_1;
Now, what can I do to get rid of that error.
CodePudding user response:
The cause of your error is likely that you're implying a store-level grouping without explicitly grouping on that column with a GROUP BY
clause. Therefore, you're attempting to extract aggregate results that are impossible at the table-level.
You can probably resolve this by adding GROUP BY store.storeID
in each of your subqueries. However, there's a lot more wrong with this query that makes it unfavorable to attempt to diagnose and resolve it.
This is all doable in a single query / grouping. Here's what your query should look like:
SELECT
store.storeID,
MAX(store.storeName) AS storeName,
COUNT(DISTINCT purchase.username) AS total_purchased_user,
MAX(player_count.players) - COUNT(DISTINCT purchase.username) AS non_purchased_user,
SUM(purchase.cost) AS total_spent,
AVG(purchase.cost) AS avg_spent,
SUBSTRING(MIN(CONCAT(LPAD(purchase.cost, 11, '0'), badge.badgeName)), 12) AS cheapest_badge,
SUBSTRING(MAX(CONCAT(LPAD(purchase.cost, 11, '0'), badge.badgeName)), 12) AS expensive_badge
FROM store
LEFT JOIN purchase ON store.storeID = purchase.storeID
LEFT JOIN badge ON purchase.badgeID = badge.badgeId
CROSS JOIN (SELECT COUNT(*) AS players FROM player) AS player_count
GROUP BY store.storeID;
What's happening here (working bottom-up):
GROUP BY
store to ensure the results are aggregated by that, and all other metrics are calculatedFROM store / LEFT JOIN all other tables
ensures we get metrics from every store, whether or not there are purchases for itCROSS JOIN (SELECT COUNT(*) FROM players)
this is a hack to give us a running total of all players that we can reference against store player-purchase counts to get the "didn't purchase" count simply and quickly, without any additional joinsCOUNT(DISTINCT purchase.username)
ensures that user counts are referenced from purchases. This also means we don't have to join on the players table in this main portion of the query to get purchase counts.SUM / AVERAGE
work like you had themSUBSTRING(MIN(CONCAT...
these calculations are using Scalar-Aggregate Reduction, a technique I invented to prevent the need for self-joining a query to get associated min/max values. There's more on this technique here: SQL Query to get column values that correspond with MAX value of another column?
Cheers!