Let's say I have a table labeled employees and it looks like so
date | employee id |
---|---|
4/1/2022 | 9 |
4/1/2022 | 8 |
3/1/2022 | 9 |
3/1/2022 | 8 |
3/1/2022 | 7 |
2/1/2022 | 9 |
2/1/2022 | 8 |
2/1/2022 | 7 |
1/1/2022 | 9 |
1/1/2022 | 8 |
1/1/2022 | 7 |
1/1/2022 | 6 |
1/1/2022 | 5 |
I'd like for my final output to look like below
date | departures |
---|---|
4/1/2022 | NULL |
3/1/2022 | 1 |
2/1/2022 | 0 |
1/1/2022 | 2 |
Essentially, if the employee is not found on the next upcoming month, it's counting it as a departure for the current month. I already am calculating this with the following query:
WITH MONTHLY_COUNT(DTE, EMP_COUNT) AS(
SELECT
[DATE],
COUNT(1) AS [COUNT]
FROM
employees A
GROUP BY
[DATE]),raw_data(EMPLOYEE, DATE) AS(
SELECT
A.EMPLOYEE,
[DATE]
FROM
employees A),RANKING(DTE, EMP_COUNT, RANKING) AS(
SELECT
A.DTE,
A.EMP_COUNT,
A.RANKING
FROM(
SELECT
TOP(6) ---pick the most recent x months
A.DTE,
A.EMP_COUNT,
RANK() OVER(ORDER BY A.DTE DESC) AS [RANK]
FROM
MONTHLY_COUNT AS A
)A(DTE, EMP_COUNT, RANKING))SELECT
A.DTE,
CASE
WHEN A.DTE = B.DTE THEN B.SEPARATIONS
WHEN A.DTE = C.DTE THEN C.SEPARATIONS
WHEN A.DTE = D.DTE THEN D.SEPARATIONS
END AS SEPARATIONS
FROM(
SELECT
TOP(6) --pick the most recent x months
A.DTE
FROM
MONTHLY_COUNT A
ORDER BY
A.DTE DESC
)A
--COMPARE MONTH RANK 2 AND 1
LEFT JOIN(
SELECT
A.DTE,
SUM(CASE WHEN B.EMPLOYEE IS NULL THEN 1 ELSE 0 END) AS SEPARATIONS
FROM(
SELECT *
FROM
RANKING A
LEFT JOIN
raw_data B ON A.DTE = B.DATE
WHERE A.RANKING = '2' ----this is what I want to automate
)A
LEFT JOIN(
SELECT *
FROM
RANKING A
LEFT JOIN
raw_data B ON A.DTE = B.DATE
WHERE A.RANKING = '1' ----this is what I want to automate
)B ON A.Employee = B.Employee
GROUP BY
A.DTE
)B ON B.DTE = A.DTE
--COMPARE MONTH RANK 3 AND 2
LEFT JOIN(
SELECT
A.DTE,
SUM(CASE WHEN B.EMPLOYEE IS NULL THEN 1 ELSE 0 END) AS SEPARATIONS
FROM(
SELECT *
FROM
RANKING A
LEFT JOIN
raw_data B ON A.DTE = B.DATE
WHERE A.RANKING = '3'
)A
LEFT JOIN(
SELECT *
FROM
RANKING A
LEFT JOIN
raw_data B ON A.DTE = B.DATE
WHERE A.RANKING = '2'
)B ON A.Employee = B.Employee
GROUP BY
A.DTE
)C ON C.DTE = A.DTE
--COMPARE MONTH RANK 4 AND 3
LEFT JOIN(
SELECT
A.DTE,
SUM(CASE WHEN B.EMPLOYEE IS NULL THEN 1 ELSE 0 END) AS SEPARATIONS
FROM(
SELECT *
FROM
RANKING A
LEFT JOIN
raw_data B ON A.DTE = B.DATE
WHERE A.RANKING = '4'
)A
LEFT JOIN(
SELECT *
FROM
RANKING A
LEFT JOIN
raw_data B ON A.DTE = B.DATE
WHERE A.RANKING = '3'
)B ON A.Employee = B.Employee
GROUP BY
A.DTE
)D ON D.DTE = A.DTE ORDER BY A.DTE DESC
The main problem is how can I automate this to where I'm not explicitly comparing months 2 and 1, and then comparing months 3 and 2, and then comparing months 4 and 3, and so on ?
CodePudding user response:
There is no need to really do this recursively - you can do it using sets of data. .
There are two approaches below - they check, for each employee and month, whether that employee has a later record. The first approach uses LEAD to get later records for an individual; the second approach explicitly checks the next month.
The two approaches will differ if a person 'comes back' several months after leaving - the first approach wouldn't treat it as a departure (as they will have a later record), while the second would (as they did not have a record for that specific month).
This db<>fiddle has the SQL below so you can examine it operating.
Initial table setup:
CREATE TABLE #EmpMonths (EmpDate date, EmployeeID int);
INSERT INTO #EmpMonths (EmpDate, EmployeeID) VALUES
('20220401', 9),
('20220401', 8),
('20220301', 9),
('20220301', 8),
('20220301', 7),
('20220201', 9),
('20220201', 8),
('20220201', 7),
('20220101', 9),
('20220101', 8),
('20220101', 7),
('20220101', 6),
('20220101', 5);
As there are no corresponding later records for the 'latest' record (1 April 2022), in this I will report only months prior to the most recent record For this I use MaxDate
. Feel free to use any other approach.
DECLARE @MaxEmpDate date;
SET @MaxEmpDate = (SELECT MAX(EmpDate) FROM #EmpMonths);
Approach 1: Using LEAD
The following SQL gets a 2nd date associated with each employee/month record. This has the month they next appear in the dataset. (Note that I have included it here for your info only - it gets used in the answer in the next block of code).
SELECT EmployeeID, EmpDate, LEAD(EmpDate, 1) OVER (PARTITION BY EmployeeID ORDER BY EmpDate) AS Later_Date
FROM #EmpMonths;
You can then use the above to get the departures by month - a NULL in the Later_Date
column indicates they didn't have a later record - so you can count NULL records by month to get your answer.
WITH EmpRecords AS
(SELECT EmpDate, LEAD(EmpDate, 1) OVER (PARTITION BY EmployeeID ORDER BY EmpDate) AS Later_Date
FROM #EmpMonths
)
SELECT EmpDate, COUNT(CASE WHEN Later_Date IS NULL THEN 1 ELSE NULL END) AS Num_Departures
FROM EmpRecords
WHERE EmpDate < @MaxEmpDate
GROUP BY EmpDate;
Approach 2: Explicitly checking the next month
The alternative approach is to explicitly check the next month, to see if they have a record for that month. The following does this with a LEFT JOIN on the same table (joining on EmployeeID
, and 1 month later).
SELECT CurrentMonth.EmployeeID, CurrentMonth.EmpDate, NextMonth.EmpDate AS Later_Date
FROM #EmpMonths AS CurrentMonth
LEFT OUTER JOIN #EmpMonths AS NextMonth ON CurrentMonth.EmployeeID = NextMonth.EmployeeID
AND DATEADD(month, 1, CurrentMonth.EmpDate) = NextMonth.EmpDate
WHERE CurrentMonth.EmpDate < @MaxEmpDate;
Similar to above, you can then count NULLs for each month.
WITH EmpRecords AS
(SELECT CurrentMonth.EmployeeID, CurrentMonth.EmpDate, NextMonth.EmpDate AS Later_Date
FROM #EmpMonths AS CurrentMonth
LEFT OUTER JOIN #EmpMonths AS NextMonth ON CurrentMonth.EmployeeID = NextMonth.EmployeeID
AND DATEADD(month, 1, CurrentMonth.EmpDate) = NextMonth.EmpDate
WHERE CurrentMonth.EmpDate < @MaxEmpDate
)
SELECT EmpDate, COUNT(CASE WHEN Later_Date IS NULL THEN 1 ELSE NULL END) AS Num_Departures
FROM EmpRecords
GROUP BY EmpDate;
Results for both are below.
EmpDate Num_Departures
2022-01-01 2
2022-02-01 0
2022-03-01 1
CodePudding user response:
Try the following :
select [date], SUM(ch) departures
from
(
select [date],
case
when [date] = (select max([date]) from table_name)
then null
when not exists(select 1 from table_name D where D.[date]=DATEADD(month, 1, T.[date]) and D.employee_id=T.employee_id)
then 1
else 0
end as ch
from table_name T
) A
group by [date]
order by [date] desc
See a demo.
when [date] = (select max([date]) from table_name)
to select last month values as null
since there is no next month.
when not exists(select 1 ...
for each row in the table, checks if an employee is not existed in the next month of the current row month, if the condition is met then return 1 else return 0 (the employee is existed in current row month and next month).