I want to find how many times a consecutive specific result happens:
Date Car Result
2021-10-07 16:33:44.270 | Toyota | PASS
2021-10-07 18:35:08.073 | Ford | PASS
2021-10-07 18:39:31.497 | Jeep | FAIL
2021-10-07 21:30:46.150 | Jeep | FAIL
2021-10-08 03:38:55.370 | Toyota | FAIL
2021-10-08 06:43:07.597 | Ford | PASS
2021-10-08 07:10:22.897 | Toyota | FAIL
2021-10-08 07:39:56.810 | Jeep | PASS
2021-10-08 10:45:39.260 | Toyota | PASS
2021-10-08 11:29:45.123 | Ford | FAIL
I want to increase the number of consecutive failures when a FAIL result appears, and reset to 0 when a PASS result is achieved:
Date Car Failures
2021-10-07 16:33:44.270 | Toyota | 0
2021-10-07 18:35:08.073 | Ford | 0
2021-10-07 18:39:31.497 | Jeep | 1
2021-10-07 21:30:46.150 | Jeep | 2
2021-10-08 03:38:55.370 | Toyota | 1
2021-10-08 06:43:07.597 | Ford | 0
2021-10-08 07:10:22.897 | Toyota | 2
2021-10-08 07:39:56.810 | Jeep | 0
2021-10-08 10:45:39.260 | Toyota | 0
2021-10-08 11:29:45.123 | Ford | 1
Can anybody help me to find the right (T)SQL query?
CodePudding user response:
This is not a pretty one, but I believe it satisfies your conditions as well as produces the desired result set when running using only your sample data.
I added some extra edge cases to demonstrate that it is working correctly.
DECLARE @Table TABLE (
Date datetime
,Car varchar(50)
,Result varchar(4)
);
INSERT INTO @Table VALUES
('2021-10-07 16:33:44.270','Toyota','PASS')
,('2021-10-07 18:35:08.073','Ford','PASS')
,('2021-10-07 18:39:31.497','Jeep','FAIL')
,('2021-10-07 21:30:46.150','Jeep','FAIL')
,('2021-10-08 03:38:55.370','Toyota','FAIL')
,('2021-10-08 06:43:07.597','Ford','PASS')
,('2021-10-08 07:10:22.897','Toyota','FAIL')
,('2021-10-08 07:39:56.810','Jeep','PASS')
,('2021-10-08 10:45:39.260','Toyota','PASS')
,('2021-10-08 11:29:45.123','Ford','FAIL')
,('2021-10-09 18:39:31.497','Jeep','FAIL')
,('2021-10-09 21:30:46.150','Jeep','FAIL')
,('2021-10-10 18:39:31.497','Jeep','PASS')
,('2021-10-10 21:30:46.150','Jeep','FAIL')
,('2021-10-10 10:45:39.260','Toyota','FAIL');
SELECT
pass_grouping.Date
,pass_grouping.Car
/*now perform final sum of your counter for consecutive failures (AKA anything other than 'PASS')*/
,SUM(CASE WHEN pass_grouping.Result = 'PASS' THEN 0 ELSE 1 END) OVER(PARTITION BY pass_grouping.Car,pass_grouping.PassGrouperId ORDER BY pass_grouping.Date) AS Failures
FROM
/*generate a PassGrouperId to indicate a group of rows in which where there was no 'PASS' ocurrence (which would be indicated by incremented the value of pass_counter.PassOcurrenceRunningTotal).*/
/*The first row of each PassGrouperId group indicates where we need to reset the value of our final column 'Failures'. So we can partition outer query on that value.*/
(SELECT
pass_counter.Date
,pass_counter.Car
,pass_counter.Result
,RANK() OVER(PARTITION BY pass_counter.Car ORDER BY pass_counter.PassOcurrenceRunningTotal) AS PassGrouperId
FROM
/*create indicator that tracks 'PASS' Result ocurrence chronologically (using Date column) per each Car value*/
(SELECT
tbl.Date
,tbl.Car
,tbl.Result
,SUM(CASE WHEN tbl.Result = 'PASS' THEN 1 ELSE 0 END) OVER(PARTITION BY Car ORDER BY Date) AS PassOcurrenceRunningTotal
FROM
@Table AS tbl) AS pass_counter) AS pass_grouping
ORDER BY
pass_grouping.Date;
CodePudding user response:
EDIT: My answer here is wrong because when the values in Result switch back to PASS or FAIL (according to the ordering in the ROW_NUMBER() OVER() clause), the numbering continues where it left off rather than resetting again to 1.
You can accomplish this with ROW_NUMBER() and some CASE
logic like so:
-- Assuming the example data you posted is in a temp table named #TBL
WITH ResultData AS (
SELECT *,
ROW_NUMBER() OVER
(
-- Reset the count whenever the car or result changes
PARTITION BY
Car,
Result
-- Order first by car, and then by date so that the car won't change until we're done with it
ORDER BY
Car,
Date
)
AS ResultCount
FROM #TBL
)
SELECT Date, Car,
CASE WHEN Result = 'PASS' THEN 0 ELSE ResultCount END AS Failures
FROM ResultData
ORDER BY Date;
This yields the result example you posted. If you want to build out the temp table and try it for yourself, then add these statements before the above query:
CREATE TABLE #TBL
(
Date datetime,
Car nvarchar(50),
Result nvarchar(4)
);
INSERT INTO #TBL VALUES
( '2021-10-07 16:33:44.270','Toyota','PASS' ),
( '2021-10-07 18:35:08.073','Ford','PASS' ),
( '2021-10-07 18:39:31.497','Jeep','FAIL' ),
( '2021-10-07 21:30:46.150','Jeep','FAIL' ),
( '2021-10-08 03:38:55.370','Toyota','FAIL' ),
( '2021-10-08 06:43:07.597','Ford','PASS' ),
( '2021-10-08 07:10:22.897','Toyota','FAIL' ),
( '2021-10-08 07:39:56.810','Jeep','PASS' ),
( '2021-10-08 10:45:39.260','Toyota','PASS' ),
( '2021-10-08 11:29:45.123','Ford','FAIL' );
(Make sure you include the ;
at the end of the INSERT
statement, otherwise it will choke on the WITH
that starts the query.)