There's a table with three columns: start date, end date and task duration in hours. For example, something like that:
Id | StartDate | EndDate | Duration |
---|---|---|---|
1 | 07-11-2022 | 15-11-2022 | 40 |
2 | 02-09-2022 | 02-11-2022 | 122 |
3 | 10-10-2022 | 05-11-2022 | 52 |
And I want to get a table like that:
Id | Month | HoursPerMonth |
---|---|---|
1 | 11 | 40 |
2 | 09 | 56 |
2 | 10 | 62 |
2 | 11 | 4 |
3 | 10 | 42 |
3 | 11 | 10 |
Briefly, I wanted to know, how many working hours is in each month between start and end dates. Proportionally. How can I achieve that by MS SQL Query? Data is quite big so the query speed is important enough. Thanks in advance!
I've tried DATEDIFF
and EOMONTH
, but that solution doesn't work with tasks > 2 months. And I'm sure that this solution is bad decision. I hope, that it can be done more elegant way.
CodePudding user response:
Here is an option using an ad-hoc tally/calendar table
Not sure I'm agree with your desired results
Select ID
,Month = month(D)
,HoursPerMonth = (sum(1.0) / (1 max(datediff(DAY,StartDate,EndDate)))) * max(Duration)
From YourTable A
Join (
Select Top 75000 D=dateadd(day,Row_Number() Over (Order By (Select NULL)),0)
From master..spt_values n1, master..spt_values n2
) B on D between StartDate and EndDate
Group By ID,month(D)
Order by ID,Month
Results
CodePudding user response:
This answer uses CTE recursion.
This part just sets up a temp table with the OP's example data.
DECLARE @source
TABLE (
SOURCE_ID INT
,STARTDATE DATE
,ENDDATE DATE
,DURATION INT
)
;
INSERT
INTO
@source
VALUES
(1, '20221107', '20221115', 40 )
,(2, '20220902', '20221102', 122 )
,(3, '20221010', '20221105', 52 )
;
This part is the query based on the above data. The recursive CTE breaks the time period into months. The second CTE does the math. The final selection does some more math and presents the results the way you want to seem them.
WITH CTE AS (
SELECT
SRC.SOURCE_ID
,SRC.STARTDATE
,SRC.ENDDATE
,SRC.STARTDATE AS 'INTERIM_START_DATE'
,CASE WHEN EOMONTH(SRC.STARTDATE) < SRC.ENDDATE
THEN EOMONTH(SRC.STARTDATE)
ELSE SRC.ENDDATE
END AS 'INTERIM_END_DATE'
,SRC.DURATION
FROM
@source SRC
UNION ALL
SELECT
CTE.SOURCE_ID
,CTE.STARTDATE
,CTE.ENDDATE
,CASE WHEN EOMONTH(CTE.INTERIM_START_DATE) < CTE.ENDDATE
THEN DATEADD( DAY, 1, EOMONTH(CTE.INTERIM_START_DATE) )
ELSE CTE.STARTDATE
END
,CASE WHEN EOMONTH(CTE.INTERIM_START_DATE, 1) < CTE.ENDDATE
THEN EOMONTH(CTE.INTERIM_START_DATE, 1)
ELSE CTE.ENDDATE
END
,CTE.DURATION
FROM
CTE
WHERE
CTE.INTERIM_END_DATE < CTE.ENDDATE
)
, CTE2 AS (
SELECT
CTE.SOURCE_ID
,CTE.STARTDATE
,CTE.ENDDATE
,CTE.INTERIM_START_DATE
,CTE.INTERIM_END_DATE
,CAST( DATEDIFF( DAY, CTE.INTERIM_START_DATE, CTE.INTERIM_END_DATE ) 1 AS FLOAT ) AS 'MNTH_DAYS'
,CAST( DATEDIFF( DAY, CTE.STARTDATE, CTE.ENDDATE ) 1 AS FLOAT ) AS 'TTL_DAYS'
,CAST( CTE.DURATION AS FLOAT ) AS 'DURATION'
FROM
CTE
)
SELECT
CTE2.SOURCE_ID AS 'Id'
,MONTH( CTE2.INTERIM_START_DATE ) AS 'Month'
,ROUND( CTE2.MNTH_DAYS/CTE2.TTL_DAYS * CTE2.DURATION, 0 ) AS 'HoursPerMonth'
FROM
CTE2
ORDER BY
CTE2.SOURCE_ID
,CTE2.INTERIM_END_DATE
;
My results agree with Mr. Cappelletti's, not the OP's. Perhaps some tweaking regarding the definition of a "Day" is needed. I don't know.
If time between start and end date is large (more than 100 months) you may want to specify OPTION (MAXRECURSION 0)
at the end.