I have the following simple table (Table1), where each row is a student_ID and their name, and each student has one or multiple wins (Wins). I would like to output: Student_ID, Student_name, count of Wins, sorted by count of Wins (descending) and then Student_ID (ascending), excluding those students who have the same count of Wins which is less than the max of the Wins (i.e.5). In other words, Lizzy and Mark have the same count of wins, and 3 is lower than 5, so the output will exclude the two students, Lizzy and Mark.
From comments: "Betty, David and Cathy should be excluded", also.
Table1:
student_id | student_name | wins |
---|---|---|
1 | John | YES |
1 | John | YES |
1 | John | YES |
1 | John | YES |
1 | John | YES |
2 | Brandon | YES |
2 | Brandon | YES |
2 | Brandon | YES |
2 | Brandon | YES |
2 | Brandon | YES |
3 | Lizzy | YES |
3 | Lizzy | YES |
3 | Lizzy | YES |
4 | Mark | YES |
4 | Mark | YES |
4 | Mark | YES |
5 | Betty | YES |
6 | David | YES |
7 | Cathy | YES |
8 | Joe | YES |
8 | Joe | YES |
Desired output:
student_id | student_name | cnt_wins |
---|---|---|
1 | John | 5 |
2 | Brandon | 5 |
8 | Joe | 2 |
Here is my SQL in Oracle. I can't figure out what went wrong. The log says "(SELECT b.cnt_wins, count(b.student_id) has too many values".
WITH st_cte AS
(SELECT student_id, student_name, count(wins) cnt_wins
FROM Table1
GROUP BY student_id, student_name
ORDER BY count(wins) DESC, student_id)
SELECT *
FROM st_cte a
WHERE a.cnt_wins not in
(SELECT b.cnt_wins, count(b.student_id)
FROM st_cte b
WHERE b.cnt_wins <
(SELECT max(c.cnt_wins) FROM st_cte c)
GROUP BY b.cnt_wins
HAVING count(b.student_id) > 1);
CodePudding user response:
There are too many values selected inside the 'in' select:
WHERE a.cnt_wins -- 1 value
not in
(SELECT b.cnt_wins, count(b.student_id) -- 2 values
FROM st_cte b
you shoud either do :
WHERE a.cnt_wins not in
(SELECT b.cnt_wins
FROM st_cte ...
or
WHERE (a.cnt_wins, count(something)) not in
(SELECT b.cnt_wins, count(b.student_id)
FROM st_cte ...
CodePudding user response:
Updated based on updated requirements...
The requirement was ambiguous in that Betty, David, and Cathy seem to also meet the criteria to be removed from the result. This requirement was clarified and those rows should have been removed.
Logic has been added to allow only all max_cnt
rows, plus any students with a unique count.
Also note that if wins
can be any other non-null value, COUNT(wins)
is not correct.
Given all that, maybe something like this is a starting point:
WITH cte AS (
SELECT student_id, student_name
, COUNT(wins) cnt_wins
, MAX(COUNT(wins)) OVER () AS max_cnt
FROM Table1
GROUP BY student_id, student_name
)
, cte2 AS (
SELECT cte.*
, COUNT(*) OVER (PARTITION BY cnt_wins) AS cnt_students
FROM cte
)
SELECT student_id, student_name, cnt_wins
FROM cte2
WHERE max_cnt = cnt_wins
OR cnt_students = 1
ORDER BY cnt_wins DESC, student_id
;
and to handle wins
that can be other non-null values:
WITH cte AS (
SELECT student_id, student_name
, COUNT(CASE WHEN wins = 'YES' THEN 1 END) cnt_wins
, MAX(COUNT(CASE WHEN wins = 'YES' THEN 1 END)) OVER () AS max_cnt
FROM Table1
GROUP BY student_id, student_name
)
, cte2 AS (
SELECT cte.*
, COUNT(*) OVER (PARTITION BY cnt_wins) AS cnt_students
FROM cte
)
SELECT student_id, student_name, cnt_wins
FROM cte2
WHERE max_cnt = cnt_wins
OR cnt_students = 1
ORDER BY cnt_wins DESC, student_id
;
Result (with data to test the new requirement, one student (Joe) with unique counts (2)):
STUDENT_ID | STUDENT_NAME | CNT_WINS |
---|---|---|
1 | John | 5 |
2 | Brandon | 5 |
8 | Joe | 2 |
Setup:
CREATE TABLE table1 (
Student_ID int
, Student_Name VARCHAR2(20)
, Wins VARCHAR2(10)
);
BEGIN
-- Assume only wins are stored.
INSERT INTO table1 VALUES ( 1, 'John', 'YES');
INSERT INTO table1 VALUES ( 1, 'John', 'YES');
INSERT INTO table1 VALUES ( 1, 'John', 'YES');
INSERT INTO table1 VALUES ( 1, 'John', 'YES');
INSERT INTO table1 VALUES ( 1, 'John', 'YES');
INSERT INTO table1 VALUES ( 2, 'Brandon', 'YES');
INSERT INTO table1 VALUES ( 2, 'Brandon', 'YES');
INSERT INTO table1 VALUES ( 2, 'Brandon', 'YES');
INSERT INTO table1 VALUES ( 2, 'Brandon', 'YES');
INSERT INTO table1 VALUES ( 2, 'Brandon', 'YES');
INSERT INTO table1 VALUES ( 3, 'Lizzy', 'YES');
INSERT INTO table1 VALUES ( 3, 'Lizzy', 'YES');
INSERT INTO table1 VALUES ( 3, 'Lizzy', 'YES');
INSERT INTO table1 VALUES ( 4, 'Mark', 'YES');
INSERT INTO table1 VALUES ( 4, 'Mark', 'YES');
INSERT INTO table1 VALUES ( 4, 'Mark', 'YES');
INSERT INTO table1 VALUES ( 5, 'Betty', 'YES');
INSERT INTO table1 VALUES ( 6, 'David', 'YES');
INSERT INTO table1 VALUES ( 7, 'Cathy', 'YES');
INSERT INTO table1 VALUES ( 8, 'Joe', 'YES');
INSERT INTO table1 VALUES ( 8, 'Joe', 'YES');
END;
/
Correction to the original query in the question:
WITH st_cte AS
(SELECT student_id, student_name, count(wins) cnt_wins
FROM Table1
GROUP BY student_id, student_name
ORDER BY count(wins) DESC, student_id
)
SELECT *
FROM st_cte a
WHERE a.cnt_wins not in
(SELECT b.cnt_wins
FROM st_cte b
WHERE b.cnt_wins < (SELECT max(c.cnt_wins) FROM st_cte c)
GROUP BY b.cnt_wins
HAVING count(b.student_id) > 1
)
;