Here's my sample input tables:
employee_id | project | effective_date** |
---|---|---|
1 | A | 2014-08-13 |
1 | B | 2016-12-21 |
1 | C | 2018-02-21 |
employee_id | designation | effective_date |
---|---|---|
1 | trainee | 2014-08-05 |
1 | senior | 2016-08-17 |
1 | team leader | 2018-02-05 |
Table1: describes an employee who undergoes different projects at different date's in an organization.
Table2: describes the same employee from Table1 who undergoes different designation in the same organisation.
Now I want an Expected output table like this:
employee_id | project | designation | effective_date |
---|---|---|---|
1 | A | trainee | 2014-08-13 |
1 | A | senior | 2016-08-17 |
1 | B | Senior | 2016-12-21 |
1 | B | team leader | 2018-02-05 |
1 | C | team leader | 2018-02-21 |
The fact is that whenever:
- his project changes, I need to display project effective_date.
- his designation changes, I need to display designation effective_date but with the project he worked on during this designation change
CodePudding user response:
Let's do this way
WITH subquery AS (
SELECT employee_id, project, effective_date as project_effective_date
FROM Table1
UNION
SELECT employee_id, project, effective_date as project_effective_date
FROM Table1
JOIN (
SELECT employee_id, MAX(effective_date) as max_date
FROM Table1
GROUP BY employee_id
) max_dates
ON Table1.employee_id = max_dates.employee_id AND Table1.effective_date = max_dates.max_date
)
SELECT subquery.employee_id, subquery.project, Table2.designation, subquery.project_effective_date
FROM subquery
LEFT JOIN Table2
ON subquery.employee_id = Table2.employee_id AND subquery.project_effective_date >= Table2.effective_date
ORDER BY subquery.employee_id, subquery.project_effective_date, Table2.effective_date
CodePudding user response:
This problem falls into the gaps-and-islands taxonomy. This specific variant can be solved in three steps:
- applying a
UNION ALL
of the two tables while splitting "tab1.project" and "tab2.role" in two separate fields within the same schema - compute the partitions, between a non-null value and following null values, with two running sums (one for the "designation" and one for "project")
- apply two different aggregations on the two different fields, to remove the null values.
WITH cte AS (
SELECT employee_id, effective_date,
project AS project,
NULL AS role FROM tab1
UNION ALL
SELECT employee_id, effective_date,
NULL AS project,
designation AS role FROM tab2
), cte2 AS (
SELECT *,
COUNT(CASE WHEN project IS NOT NULL THEN 1 END) OVER(
PARTITION BY employee_id
ORDER BY effective_date
) AS project_partition,
COUNT(CASE WHEN role IS NOT NULL THEN 1 END) OVER(
PARTITION BY employee_id
ORDER BY effective_date
) AS role_partition
FROM cte
)
SELECT employee_id, effective_date,
MAX(project) OVER(PARTITION BY project_partition) AS project,
MAX(role) OVER(PARTITION BY role_partition) AS role
FROM cte2
ORDER BY employee_id, effective_date
Check the demo here.
CodePudding user response:
This answer is turned out to be very similar to @Lemon's answer. But still I will not delete, at the discretion of @Lemon. I think that is usefull. First, union all dates for Employee (designation and project changes). We have null values in table.
employee_id | designation | project | effective_date |
---|---|---|---|
1 | trainee | null | 2014-08-05 |
1 | null | A | 2014-08-13 |
1 | senior | null | 2016-08-17 |
1 | null | B | 2016-12-21 |
1 | team leader | null | 2018-02-05 |
1 | null | C | 2018-02-21 |
With method, discussed here, null values will be replaced properly in 2 partitions (project and designation).
with PD as(select employee_id, designation, null as project,effective_date from empDes
union all
select employee_id, null designation, project as project,effective_date from empProj
)
,PDranged as(
select *
,sum(case when project is null then 0 else 1 end)
over(partition by employee_id order by effective_date) p_rn
,sum(case when designation is null then 0 else 1 end)
over(partition by employee_id order by effective_date) d_rn
from PD
)
,Res as(
select *
,first_value(project)
over(partition by employee_id,p_rn order by effective_date) p_project
,first_value(designation)
over(partition by employee_id,d_rn order by effective_date) p_designation
from PDranged
)
select employee_id,p_project as project,p_designation as designation,effective_date
from Res
order by effective_date
employee_id | project | designation | effective_date |
---|---|---|---|
1 | null | trainee | 2014-08-05 |
1 | A | trainee | 2014-08-13 |
1 | A | senior | 2016-08-17 |
1 | B | senior | 2016-12-21 |
1 | B | team leader | 2018-02-05 |
1 | C | team leader | 2018-02-21 |
Demo here