I have a table candidates
in which there are three columns id
, position
, and salary
.
The code to generate the table is
create table candidates (
id int primary key,
position varchar not null,
salary int not null
);
insert into candidates values (1, 'junior', 10500);
insert into candidates values (2, 'senior', 15000);
insert into candidates values (3, 'senior', 35000);
insert into candidates values (4, 'junior', 8000);
insert into candidates values (5, 'senior', 30000);
insert into candidates values (6, 'senior', 25000);
insert into candidates values (7, 'junior', 30000);
insert into candidates values (8, 'senior', 50000);
insert into candidates values (9, 'senior', 30000);
insert into candidates values (10, 'junior', 7000);
insert into candidates values (11, 'junior', 8000);
insert into candidates values (12, 'senior', 33000);
insert into candidates values (13, 'junior', 5000);
insert into candidates values (14, 'senior', 47000);
insert into candidates values (15, 'junior', 12000);
I have a budget of 150000 and I need to hire as many low-cost seniors as possible, and with the remaining money, I can hire as many juniors as possible. I wrote this code to get the cumulative sum of money for seniors and juniors.
SELECT id, position, salary, SUM(salary) OVER (PARTITION BY position ORDER BY salary) AS cum_salary
FROM candidates
ORDER BY position DESC, salary;
The output I get is
id | position | salary | cum_salary |
---|---|---|---|
2 | senior | 15000 | 15000 |
6 | senior | 25000 | 40000 |
5 | senior | 30000 | 100000 |
9 | senior | 30000 | 100000 |
12 | senior | 33000 | 133000 |
3 | senior | 35000 | 168000 |
14 | senior | 47000 | 215000 |
8 | senior | 50000 | 265000 |
13 | junior | 5000 | 5000 |
10 | junior | 7000 | 12000 |
11 | junior | 8000 | 28000 |
4 | junior | 8000 | 28000 |
1 | junior | 10500 | 38500 |
15 | junior | 12000 | 50500 |
7 | junior | 30000 | 80500 |
I can see that I can use 133000 (which is less than 150000) to hire 5 seniors and use the rest of the money (150000 - 133000 = 17000) to hire two juniors. So, the final output should be something like
senior | 5 |
---|---|
junior | 2 |
How can I write this query using CTE and Window Function in POSTGRESQL 9.6 to get the kind of output I need for larger datasets, where it will not always be possible to do it manually?
PS: I am not an expert user of Postgres 9.6.
CodePudding user response:
This would do it:
WITH cte AS (
SELECT position, salary
, sum(salary) OVER w AS cum_salary
, row_number() OVER w AS count
FROM candidates
WINDOW w AS (PARTITION BY position ORDER BY salary)
)
, sen AS (
SELECT position, cum_salary, count
FROM cte
WHERE position = 'senior'
AND cum_salary <= 150000
ORDER BY cum_salary DESC
LIMIT 1
)
SELECT position, count FROM sen
UNION ALL
( -- parentheses required
SELECT position, count
FROM cte
WHERE position = 'junior'
AND cum_salary <= (SELECT 150000 - cum_salary FROM sen)
ORDER BY cum_salary DESC
LIMIT 1
);
db<>fiddle here
CodePudding user response:
As long as you can hardcode the amount the CTE would look like the below. PS: You had a good start. There also may be a way to optimize the code below. Good luck ;)
with seniors as (
SELECT
id
, position
, salary
, SUM(salary) OVER (PARTITION BY position ORDER BY salary) AS cum_salary
FROM candidates
where position = 'senior'
)
,juniors as (
SELECT
id
, position
, salary
, SUM(salary) OVER (PARTITION BY position ORDER BY salary) AS cum_salary
FROM candidates
where position = 'junior'
)
,seniors_canhire as (
select
id
, position
, salary
from seniors
where cum_salary <= 150000
)
,juniors_canhire as (
SELECT
id
, position
, salary
FROM juniors
where cum_salary <= (select 150000-sum(salary) from seniors_canhire)
)
select *
from seniors_canhire
union all
select *
from juniors_canhire