id| startdate | enddate | team | score
1| 2011-09-25 | 2012-03-31 | blue | 380
1| 2012-04-01 | 2012-07-31 | blue | 147
1| 2012-08-01 | 2013-01-31 | green| 217
1| 2013-02-01 | 2015-02-28 | green| 1540
1| 2015-03-01 | 2015-09-30 | green| 300
1| 2015-10-01 | 2018-04-30 | blue | 2193
1| 2018-05-01 | 2020-12-31 | blue | 1743
1| 2023-05-01 | 2023-09-30 | blue | 159
1| 2023-10-01 | 9999-12-31 | green| *
I need to merge the continuous time periods in this view where the team is the same and add the scores to get that team's total score in that period.
Note that, the value for the last time frame is * because it is the current time frame, and the value is only inserted after the time period is finished. Also, all the values for the score are in string format not integer format.
Expected output:
id| startdate| enddate| team| score
1| 2011-09-25| 2012-07-31| blue| 527
1| 2012-08-01| 2015-09-30| green| 2057
1| 2015-10-01| 2020-12-31| blue| 3963
1| 2023-05-01| 2023-09-30| blue| 159
1| 2023-10-01| 9999-12-31| green| *
CodePudding user response:
This is a gaps and islands problem, you may try the following using ROW_NUMBER
function to define groups for the consecutive teams, then using DATEDIFF
define another groups within a team to check if there is break in the dates.
WITH grps AS
(
SELECT *,
ROW_NUMBER() OVER (PARTITION BY id ORDER BY startdate) -
ROW_NUMBER() OVER (PARTITION BY id, team ORDER BY startdate) grp
FROM table_name
)
SELECT id, MIN(startdate) startdate, MAX(enddate) enddate, team,
CASE WHEN MAX(score) = '*'
THEN '*'
ELSE SUM(score)
END AS score
FROM
(
SELECT *, COALESCE(DATEDIFF(startdate, LAG(enddate) OVER (PARTITION BY grp ORDER BY enddate)), 1) grp2
FROM grps
) T
GROUP BY id, team, grp, grp2
ORDER BY id, startdate
See a demo.