I have a table with employees that looks like this:
Name | Department | Manager | Date |
---|---|---|---|
Employee 1 | Dept 1 | Manager X | 202101 |
Employee 1 | Dept 1 | Manager X | 202102 |
Employee 1 | Dept 2 | Manager X | 202103 |
Employee 1 | Dept 2 | Manager X | 202104 |
Employee 1 | Dept 1 | Manager X | 202105 |
Employee 1 | Dept 1 | Manager X | 202106 |
Employee 2 | Dept 1 | Manager X | 202101 |
Employee 2 | Dept 1 | Manager X | 202102 |
I need to build a view that shows the data in the following format:
Name | Department | Manager | Valid_From | Valid_To |
---|---|---|---|---|
Employee 1 | Dept 1 | Manager X | 202101 | 202102 |
Employee 1 | Dept 2 | Manager X | 202103 | 202104 |
Employee 1 | Dept 1 | Manager X | 202105 | 999912 |
Employee 2 | Dept 1 | Manager X | 202101 | 999912 |
So far, this is what why code looks like:
WITH cte AS
(
SELECT [Name], Department, Manager, Valid_From = min([Date]), Valid_To = max([Date]),
RowNum = ROW_NUMBER() OVER (PARTITION BY [Name], ORDER BY max([Date]) DESC)
FROM TestingTable
WHERE ([Date] IS NOT NULL)
GROUP BY [Name], Department, Manager
)
SELECT [Name], Department, Manager, Valid_From,
CASE WHEN RowNum = 1 THEN 999912 ELSE Valid_To END AS Valid_To, CASE WHEN RowNum = 1 THEN 1 ELSE 0 END AS Is_Latest
FROM cte
The output is this - it groups the intervals that employee 1 has worked in department 1, while I need it in 2 different chronological intervals.
Name | Department | Manager | Valid_From | Valid_To |
---|---|---|---|---|
Employee 1 | Dept 1 | Manager X | 202101 | 999912 |
Employee 1 | Dept 2 | Manager X | 202103 | 202104 |
Employee 2 | Dept 1 | Manager X | 202101 | 999912 |
I experimented a bit with the lag and lead functions to compare the dates, but I'm lost.
CodePudding user response:
Looks like you need to group consecutive year-months per employee-department-manager together. It could be done like so:
with cte1 as (
select name
, department
, manager
, datefromparts(date / 100, date % 100, 1) as yymm
from t
), cte2 as (
select *
, case when lag(yymm) over (partition by name, department, manager order by yymm) = dateadd(month, -1, yymm) then 0 else 1 end as new_grp
from cte1
), cte3 as (
select *
, sum(new_grp) over (partition by name, department, manager order by yymm) as grp_num
from cte2
)
select name
, department
, manager
, min(yymm) as valid_from
, max(yymm) as valid_to
from cte3
group by name, department, manager, grp_num
order by name, valid_from, department, manager
Note that I had to convert the year-months to dates for easier comparison. Result:
name | department | manager | valid_from | valid_to |
---|---|---|---|---|
Employee 1 | Dept 1 | Manager X | 2021-01-01 | 2021-02-01 |
Employee 1 | Dept 2 | Manager X | 2021-03-01 | 2021-04-01 |
Employee 1 | Dept 1 | Manager X | 2021-05-01 | 2021-06-01 |
Employee 2 | Dept 1 | Manager X | 2021-01-01 | 2021-02-01 |
Replacing the last valid_to
for each employee with 9999-12-01
is trivial e.g. you can check if lead(valid_from) over (partition by name order by valid_from)
is null.