SQL exclude rows containing nearly duplicate values filtered by grouped max values-CodePudding

I have a table like this...

ID	User_ID	Skill	Skill_Level	Skill_ID
1	1	Project Manager	3	1
2	4	Teacher	6	2
3	2	Teacher	5	2
4	3	Administrator	2	3
5	3	Project Manager	3	1
6	1	Project Manager	4	1
7	4	Deputy Head	5	4
8	2	Teacher	6	2

For regulatory reasons we cannot delete old (historic) rows. I need to run a query that just returns the set of "User_ID" "Skill" and "Best Skill-Level" (i.e. max value per person). The successful query should look like this... So, some people can have more than one skill (User_ID 3 and 4 in this example) and some people become more skilled in their skill (User_ID 1 and 2).

ID	User_ID	Skill	Skill_Level	Skill_ID
2	4	Teacher	6	2
4	3	Administrator	2	3
5	3	Project Manager	3	1
6	1	Project Manager	4	1
7	4	Deputy Head	5	4
8	2	Teacher	6	2

Have you a suggestion about the most efficient and shortest query code that can achieve this? There is a table [dbo].[skills] with PK skill_ID_PK a table [dbo].[people] with PK user_ID_PK and a table [dbo].[skillHistory] with FK skill_ID_PK and FK user_ID_PK.

Any help most welcomed.

CodePudding user response：

Using a SQL Generator for Snowflake I found that using QUALIFY was the recommendation, but I don't think that works in MSSQL.

To do this without QUALIFY, you might try something like this:

MSSQL

SELECT *
FROM (
    SELECT *,
    ROW_NUMBER() OVER (PARTITION BY User_ID, Skill ORDER BY Skill_Level DESC) AS RN
    FROM table
    ) AS T
WHERE T.RN = 1;

For RDBMS that support QUALIFY,

SELECT 
  * 
FROM 
  table QUALIFY ROW_NUMBER() OVER (
    PARTITION BY User_ID, Skill 
    ORDER BY Skill_Level DESC
  ) = 1