I have a table History with the columns date, person and status and I need to know what is the total amount of time spent since it started until it reaches the finished status ( Finished status can occur multiples times). I need to get the datediff from the first time it's created until the first time it's with status finished, afterwards I need to get the next date were it's not finished and get again the datediff using the date it was again finished and so on. Another condition is to do this calculation only if Person who changed the status is not null. After that I need to sum all times and get the total.
I tried with Lead and Lag function but was not getting the results that I need.
CodePudding user response:
First let's talk about providing demo data. Here's a good way to do it: Create a table variable similar to your actual object(s) and then populate them:
DECLARE @statusTable TABLE (Date DATETIME, Person INT, Status NVARCHAR(10), KeyID NVARCHAR(7))
INSERT INTO @statusTable (Date, Person, Status, KeyID) VALUES
('2022-10-07 07:01:17.463', 1, 'Start', 'AAA-111'),
('2022-10-07 07:01:17.463', 1, 'Waiting', 'AAA-111'),
('2022-10-11 14:01:44.463', 1, 'Waiting', 'AAA-111'),
('2022-10-14 10:04:17.463', 1, 'Waiting', 'AAA-111'),
('2022-10-14 10:04:17.463', 1, 'Finished','AAA-111'),
('2022-10-14 10:04:17.463', 1, 'Waiting', 'AAA-111'),
('2022-10-17 17:01:17.463', 1, 'Waiting', 'AAA-111'),
('2022-10-21 11:03:17.463', 1, 'Waiting', 'AAA-111'),
('2022-10-21 11:03:17.463', 1, 'Finished','AAA-111'),
('2022-10-21 11:03:17.463', 1, 'Waiting', 'AAA-111'),
('2022-10-21 11:04:17.463', NULL, 'Waiting', 'AAA-111'),
('2022-10-21 11:05:17.463', 1, 'Finished','AAA-111')
Your problem is recursive, so we can use a rCTE to resolve it.
;WITH base AS (
SELECT *, CASE WHEN LAG(Status,1) OVER (PARTITION BY KeyID ORDER BY Date) <> 'Waiting' AND Status = 'Waiting' THEN 1 END AS isStart, ROW_NUMBER() OVER (PARTITION BY KeyID ORDER BY Date) AS rn
FROM @statusTable
), rCTE AS (
SELECT date AS startDate, date, Person, Status, KeyID, IsStart, rn
FROM base
WHERE isStart = 1
UNION ALL
SELECT a.startDate, r.date, r.Person, r.Status, a.KeyID, r.IsStart, r.rn
FROM rCTE a
INNER JOIN base r
ON a.rn 1 = r.rn
AND a.KeyID = r.KeyID
AND r.IsStart IS NULL
)
SELECT StartDate, MAX(date) AS FinishDate, KeyID, DATEDIFF(MINUTE,StartDate,MAX(Date)) AS Minutes
FROM rCTE
GROUP BY rCTE.startDate, KeyID
HAVING COUNT(Person) = COUNT(KeyID)
StartDate FinishDate KeyID Minutes
---------------------------------------------------------------
2022-10-07 07:01:17.463 2022-10-14 10:04:17.463 AAA-111 10263
2022-10-14 10:04:17.463 2022-10-21 11:03:17.463 AAA-111 10139
What we're doing here is finding, and marking the starts. Since when there is a Start row, the timestamp matches the first Waiting row and there isn't always a start row, we're gonna use the first waiting row as the start marker. Then, we go through and find the next Finish row for that KeyID.
Using this we can now group on the StartDate, Max the StatusDate (as FinishDate) and then use a DATEDIFF to calculate the difference.
Finally, we compare the count of KeyIDs to the count of Person. If there is a NULL value for Person the counts will not match, and we just discard the data.
CodePudding user response:
select min(date) as start
,max(date) as finish
,datediff(millisecond, min(date), max(date)) as diff_in_millisecond
,sum(datediff(millisecond, min(date), max(date))) over() as total_diff_in_millisecond
from
(
select *
,count(case when Status = 'Finished' then 1 end) over(order by date desc, status desc) as grp
,case when person is null then 0 else 1 end as flg
from t
) t
group by grp
having min(flg) = 1
order by start
start | finish | diff_in_millisecond | total_diff_in_millisecond |
---|---|---|---|
2022-10-07 07:01:17.4630000 | 2022-10-14 10:04:28.4730000 | 615791010 | 1242093518 |
2022-10-14 10:04:28.4730000 | 2022-10-21 11:03:06.7170000 | 608318244 | 1242093518 |
2022-10-26 12:46:14.7730000 | 2022-10-26 17:45:59.0370000 | 17984264 | 1242093518 |