SQL Query to find percentage difference in salary by emp id-CodePudding

I have the following table

CREATE TABLE test1 (in_date date, emp_id int , salary int);

INSERT INTO test1 (in_date, emp_id, salary)
VALUES
('01-01-2020',1,4000),
('01-01-2020',2,3000),
('01-01-2020',3,6000),
('01-01-2020',4,9000),
('02-01-2020',1,2000),
('02-01-2020',2,5000),
('02-01-2020',3,4200),
('02-01-2020',4,8500);

In_date	emp_id	salary
01-01-2020	1	4000
01-01-2020	2	3000
01-01-2020	3	6000
01-01-2020	4	9000
02-01-2020	1	2000
02-01-2020	2	5000
02-01-2020	3	4200
02-01-2020	4	8500

I would like to calculate percentage change in salary per emp_id

The output should look like this

emp_id	per_diff_salary
1	50%
2	-67%
3	30%
4	6%

any help is highly appreciated

CodePudding user response：

You need ROW_NUMBER to identify which row is the first and which the second. Then use conditional aggregation over that.

The calculation you are looking for appears to be 100 - SecondValue * 100 / FirstValue

SELECT
  t1.emp_id,
  [percent] = 100 - MIN(CASE WHEN t1.rn = 2 THEN t1.salary END) * 100.0 / MIN(CASE WHEN t1.rn = 1 THEN t1.salary END)
FROM (
    SELECT *,
      rn = ROW_NUMBER() OVER (PARTITION BY t1.emp_id ORDER BY t1.in_date)
    FROM test1 t1
) t1
GROUP BY
  t1.emp_id

db<>fiddle

If you could have more than two rows per emp_id then you need something to identify the last row

SELECT
  t1.emp_id,
  [percent] = 100 - MIN(CASE WHEN t1.NextDate IS NULL THEN t1.salary END) * 100.0 / MIN(CASE WHEN t1.rn = 1 THEN t1.salary END)
FROM (
    SELECT *,
      rn = ROW_NUMBER() OVER (PARTITION BY t1.emp_id ORDER BY t1.in_date),
      NextDate = LEAD(t1.in_date) OVER (PARTITION BY t1.emp_id ORDER BY t1.in_date)   -- will be null on the last row
    FROM test1 t1
) t1
GROUP BY
  t1.emp_id

CodePudding user response：

You may subselect the max and min date and then choose those salaries to calculate by joining by date

select distinct e.emp_id, 
( CAST((maxsal.salary-minsal.salary)*100  AS DECIMAL(10,2))/ CAST(minsal.salary AS DECIMAL(10,2)) )  as salary_increase_from_first_to_last_salary
 FROM test1 E
 join (
   select emp_id, salary FROM test1 EE where in_date = 
    (select min(in_date) FROM test1 EEE where EEE.emp_id = EE.emp_id)
  ) as minsal on minsal.emp_id = e.emp_id
 join (
   select emp_id, salary FROM test1 EE where in_date = 
    (select max(in_date) FROM test1 EEE where EEE.emp_id = EE.emp_id)
  ) as maxsal on maxsal.emp_id = e.emp_id
;

You may need to adapt the casts, or you may not need them at all.

Of course, if the table is big this may not be the best way performancewise and you may need to analise its plan before you accept it. Probably you can find a best way using other subselects or grouping, but this is the most obvious way IMHO by pure SQL.