I have this table:
CREATE TABLE applicants
(
id INT PRIMARY KEY IDENTITY,
[name] varchar(255),
[age] int,
[address] varchar(255),
[programming language] varchar(255),
[cognitive score] int
)
And I would need the top 2 applicants based on cognitive score from each programming language.
I should do it by subqueries or joins or something similar, but not with ROW_NUMBER()
.
I need only 3 columns in the result: name, cognitive score, programming language
I am aware of ORDER BY
vs only 3 columns issue.
It seems so easy but I have been struggling a lot. I would really appreciate some help
CodePudding user response:
Not sure why you don't want to / can't use row_number() but here's some code that works using CROSS APPLY.
First I created a table with your specification and my dummy data:
DROP TABLE IF EXISTS #applicants;
CREATE TABLE #applicants (
id INT PRIMARY KEY IDENTITY,
[name] VARCHAR(255),
[age] INT,
[address] VARCHAR(255),
[programming language] VARCHAR(255),
[cognitive score] INT
);
INSERT INTO #applicants (name, age, address, [programming language], [cognitive score])
VALUES ('a', 20, 'address1', 'SQL', 70),
('b', 31, 'address2', 'SQL', 80),
('c', 32, 'address3', 'SQL', 90),
('d', 33, 'address4', 'C', 71),
('e', 34, 'address5', 'C', 81),
('f', 35, 'address6', 'C', 91),
('g', 36, 'address7', 'C#', 72),
('h', 37, 'address8', 'C#', 82),
('i', 38, 'address9', 'C#', 92);
Then this code gets your actual output.
SELECT A.id,
A.name,
A.age,
A.id,
A.[programming language],
A.[cognitive score]
FROM #applicants AS A
CROSS APPLY (
SELECT TOP 2 t.id,
t.[cognitive score]
FROM #applicants AS t
WHERE A.[programming language] = t.[programming language]
ORDER BY t.[cognitive score] DESC
) AS A2
WHERE A.id = A2.id;
The initial SELECT * FROM Applicants
would return absolutely everything. The CROSS APPLY
works by looking for the TOP 2 based on Cognitive Score
whilst matching on Programming Language
.
Finally they join back using ID
which forces the CROSS APPLY
to act more like an inner join and only return the rows where the IDs match. Without it you'd end up with 18 rows because of the 9 rows each repeating twice. If this explanation is unclear take a look at this version of the query to see the duplication:
SELECT A.id,
A.name,
A.age,
A.id,
A.[programming language],
A.[cognitive score],
A2.ID,
A2.[cognitive score]
FROM #applicants AS A
CROSS APPLY (
SELECT TOP 2 t.id,
t.[cognitive score]
FROM #applicants AS t
WHERE A.[programming language] = t.[programming language]
ORDER BY t.[cognitive score] DESC
) AS A2