As I know HAVING
clause is used to filter rows for each group.
I have a table that stores scores of students.
create table sc
(
`classid` int,
`studentid` int,
`score` int
);
Here is the sample data:
--------- ----------- -------
| classid | studentid | score |
--------- ----------- -------
| 1 | 1 | 50 |
| 1 | 2 | 59 |
| 1 | 3 | 80 |
| 1 | 4 | 68 |
| 1 | 5 | 70 |
| 1 | 6 | 20 |
| 1 | 7 | 90 |
| 1 | 8 | 100 |
| 1 | 9 | 25 |
| 2 | 1 | 51 |
| 2 | 2 | 59 |
| 2 | 3 | 80 |
| 2 | 4 | 68 |
| 2 | 5 | 70 |
| 2 | 6 | 30 |
| 2 | 7 | 44 |
| 2 | 8 | 80 |
| 3 | 1 | 20 |
| 1 | 11 | 30 |
| 1 | 12 | 40 |
--------- ----------- -------
And I want to query the max score of each class, so I wrote this SQL statement:
select *
from sc
group by classid
having score = max(score);
But the output is not what I expect. The output only prints one row.
--------- ----------- -------
| classid | studentid | score |
--------- ----------- -------
| 3 | 1 | 20 |
--------- ----------- -------
CodePudding user response:
If you have columns in your SELECT clause that are not being aggregated by an aggregate formula like
sum()
,max()
,avg()
, etc, then those columns also need to be present in your GROUP BY. Older versions of mysql are the only RDBMS that doesn't error when you miss this step. Instead of erroring, it will just grab whichever values it wishes and give you random results every time you run.HAVING is one of the last steps to execute in SQL (just before ORDER BY). there's some nuance there with window functions and other stuff that executes late. Because of that doing
score = max(score)
doesn't make much sense. Either score is aggregated at this point or it is not. You can't compare both aggregated state and non-aggregated state of that column at the same time.
Instead you want a correlated subquery:
SELECT *
FROM t1 as dt
WHERE score =
(
SELECT MAX(score)
FROM t1 as dt2
WHERE dt.classid = dt2.classid
);
Alternatively you can use window functions:
SELECT *
FROM
(
SELECT classid,
studentid,
score,
MAX(score) OVER (PARTITION BY classid) as maxclassscore
FROM t1
) dt
WHERE score = maxclassscore;
CodePudding user response:
Your code is not valid, but with your actual data you get only one result as only one has 100
SELECT * FROM sc WHERE score = (select max(score) maxscore FROM sc);
classid | studentid | score ------: | --------: | ----: 1 | 8 | 100
db<>fiddle here