How does these 2 sql queries work - to find two minimum and maximum salaries-CodePudding

Can anyone explain how does this two queries work ?

Q) Write a query to retrieve two minimum and maximum salaries from the EmployeePosition table. To retrieve two minimum salaries, you can write a query as below:

A)To retrieve two minimum salaries, you can write a query as below:

SELECT DISTINCT Salary 
FROM EmployeePosition E1 
WHERE 2 >= (SELECT COUNT(DISTINCT Salary )
            FROM EmployeePosition E2 
            WHERE E1.Salary >= E2.Salary
            ) ORDER BY E1.Salary DESC;

To retrieve two maximum salaries, you can write a query as below:

SELECT DISTINCT Salary 
FROM EmployeePosition E1 
WHERE 2 >= (SELECT COUNT(DISTINCT Salary) 
            FROM EmployeePosition E2 
            WHERE E1.Salary <= E2.Salary
           ) 
ORDER BY E1.Salary DESC;

Reference table

is there any alternative SQL query to get the same result?

CodePudding user response：

The question is different from what you have asked

Q21. Write a query to find the Nth highest salary from the table without using TOP/limit keyword.

That is the second highest salary and it can be done by using row_number supported on MySQL 8.x

WITH max_salary AS
(
SELECT *,
   DENSE_RANK() OVER (ORDER BY Salary Desc) AS Rnk
FROM EmployeePosition 
)
SELECT max_salary.*
FROM max_salary
WHERE Rnk=2;

MySQL DENSE_RANK Function assigns a rank to each row within a partition or result set (in your case it is a result set) with no gaps in ranking values.

Meaning the same salary will have the same rank.

For example using the data on the linked question:

create table EmployeePosition  (
EmpID int,
EmpPosition varchar(25),
DateOfJoining date ,
Salary int );

insert into EmployeePosition values 
(1,'Manager','2022-05-01',500000),
(2,'Executive','2022-05-02',75000),
(3,'Manager','2022-05-01',90000),
(2,'Lead','2022-05-02',85000),
(1,'Executive','2022-05-01',300000),
(3,'Manager','2022-05-01',500000);



SELECT *,
       DENSE_RANK() OVER (ORDER BY Salary Desc) AS Rnk
    FROM EmployeePosition

Result:

EmpID   EmpPosition DateOfJoining   Salary  Rnk
 1        Manager    2022-05-01     500000  1
 3        Manager    2022-05-01     500000  1
 1        Executive  2022-05-01     300000  2
 3        Manager    2022-05-01     90000   3
 2        Lead       2022-05-02     85000   4
 2        Executive  2022-05-02     75000   5

As you can see each Salary is assigned a rank you have two 500000 salary with rank 1 , so the second highest value is 300000 which is filtered on the WHERE Rnk=2;.

The above main query could be written differently:

select EmpID,EmpPosition,DateOfJoining,Salary
from ( SELECT *,
              DENSE_RANK() OVER (ORDER BY Salary Desc) AS Rnk
      FROM EmployeePosition 
     ) as tbl 
WHERE Rnk=2;

https://dbfiddle.uk/Meh2AloO

Can you please explain the sql queries in the question?

Let's explain below example

SELECT DISTINCT Salary 
FROM EmployeePosition E1 
WHERE 2 >= ( SELECT COUNT(DISTINCT Salary )
            FROM EmployeePosition E2 
            WHERE E1.Salary <= E2.Salary
            ) 
ORDER BY E1.Salary DESC;

This is known as Correlated subqueries, which are the one in which inner query or subquery reference outer query. Outer query needs to be executed before inner query.

For each record processed by outer query, inner query will be executed and will return how many records has records has salary less than the current salary. If you are looking for second highest salary then your query will stop as soon as inner query will return 2.

CodePudding user response：

I think your answer would be:

By the inner selection, I mean:

(SELECT COUNT(DISTINCT Salary) 
FROM EmployeePosition E2 
WHERE E1.Salary <= E2.Salary
)

The system checks all the states which each salary is greater/less than how many salaries of the rest of the list(Salary list). In another word the system calculates it for all the rows and for each row it returns the number which says that this salary value is greater/less than how many salaries.

Imagine we have distinguish values in rows and they are sorted descending, so when it checks for the highest salary it returns the total row number as a result because it calculates that there are total row number states which the salary is less and equal to the highest salary. In the same way, For the lowest salary we will only have 1 state which the salary is equal to itself. So when the system checks for this logic:

WHERE 2 >= (SELECT COUNT(DISTINCT Salary) 
    FROM EmployeePosition E2 
    WHERE E1.Salary <= E2.Salary
    )

it looks up for the situations that the salary is greater/less and equal to 2 other salaries and by the outer selection it returns the value of salaries.

I think that is really time consuming especially when dealing with large databases. As an alternative you can use this code:

SELECT
Salary, Rank
FROM(
    SELECT
    Salary,
    Rank= ROW_NUMBER() OVER(ORDER BY(Salary) DESC)
    FROM EmployeePosition
    ) X
WHERE Rank<=2