Say I have a table:
date | fruit | status | numberOfFruits |
---|---|---|---|
2022-01 | apple | ripe | 3 |
2022-01 | banana | mature | 5 |
2022-01 | pear | ripe | 10 |
2022-01 | grapes | mature | 9 |
2022-02 | apple | ripe | 3 |
2022-02 | banana | mature | 3 |
2022-02 | pear | ripe | 3 |
2022-02 | grapes | mature | 7 |
I want to be able to create a query that adds 3 aggregate columns (AVG, MIN, MAX) that aggregates the column numberOfFruits
based on their status
and date
, ordered by fruit
and date
. The output of the table should be:
date | fruit | status | numberOfFruits | AvgNumOfFruits | MaxNumOfFruits | MinNumOfFruits |
---|---|---|---|---|---|---|
2022-01 | apple | ripe | 3 | 6.5 | 10 | 3 |
2022-01 | pear | ripe | 10 | 6.5 | 10 | 3 |
2022-01 | banana | mature | 5 | 7 | 9 | 5 |
2022-01 | grapes | mature | 9 | 7 | 9 | 5 |
2022-02 | apple | ripe | 3 | 3 | 3 | 3 |
2022-02 | pear | ripe | 3 | 3 | 3 | 3 |
2022-02 | banana | mature | 3 | 5 | 7 | 3 |
2022-02 | grapes | mature | 7 | 5 | 7 | 3 |
I'm at a loss and here's what I have so far:
SELECT
date,
fruit,
status,
numberOfFruits,
AVG(CASE WHEN "status" = 'ripe' THEN "numberOfFruits" ELSE "numberOfFruits" END) as AvgNumOfFruits,
MIN(CASE WHEN "status" = 'ripe' THEN "numberOfFruits" ELSE "numberOfFruits" END) as MingNumOfFruits,
MAX(CASE WHEN "status" = 'ripe' THEN "numberOfFruits" ELSE "numberOfFruits" END) as MaxNumOfFruits
FROM fruitdata
GROUP BY 1, 2, 3, 4
ORDER BY date, status
The query just repeats the numberofFruits
for those 3 aggregate columns. It should be like "take the average of the column numberOfFruits
that has the status
of ripe, as well as the date
2022-01, and put it in a new column called AvgNumOfFruits
. Somehow I can't translate this to SQL.
Any advice/tips/help would be appreciated. Thanks in advance!
CodePudding user response:
Aggregation functions are employed to aggregate rows (combining rows to get aggregated values). In your case you should rather use window functions, that compute values over windows (partitions/groups of rows), though without aggregating the rows.
SELECT *,
AVG(numberOfFruits) OVER(PARTITION BY date, status) AS AvgNumOfFruits,
MAX(numberOfFruits) OVER(PARTITION BY date, status) AS MaxNumOfFruits,
MIN(numberOfFruits) OVER(PARTITION BY date, status) AS MinNumOfFruits
FROM fruitdata
ORDER BY date,
status DESC
The corresponding window functions need to be computed on groups of "date" and "status" fields. On the other hand, the ordering is completely optional and up to you (ORDER BY
clause).
Check the demo here.