Home > Mobile >  SQL Server: add column for rows since value changed
SQL Server: add column for rows since value changed

Time:10-29

I have a table that contains 3 columns: personID, weeknumber, and event. Event is 0 if there was no event for that person in that week and 1 if there was.

I need to create a new column weekssincelastevent which will be 0 for the week where event=1 and then 1,2,3,4 etc for the weeks afterwards. If there is a later event then it starts from 0 again. E.g.

personID weeknumber event weekssincelastevent
1 1 0 NULL
1 2 0 NULL
1 3 1 0
1 4 0 1
1 5 0 2
1 6 0 3
2 1 0 NULL
2 2 1 0
2 3 0 1
2 4 1 0
2 5 0 1

The column should be NULL before the first events and all values NULL where a personID never has event.

I can't think how to write this in SQL.

The table has ~600m rows (60m personIDs with 100 weeknumbers each, although some personIDs don't have all the weeknumbers).

Many thanks for any insight.

CodePudding user response:

This is a bit of a gaps and island problem here. The first part, in the CTE, puts the data into "groups". Each time there is an event that's a new group. it also calculates the number of weeks that past since the prior week (which is set to 0 for rows hosting an event). Then in the outer query we SUM the number of weeks past in each group, giving the number of weeks that have passed:

WITH Groups AS(
    SELECT PersonID,
           WeekNumber,
           Event,
           COUNT(CASE Event WHEN 1 THEN 1 END) OVER (PARTITION BY PersonID ORDER BY WeekNumber ASC
                                                     ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS Events,
           CASE Event WHEN 0 THEN WeekNumber - LAG(WeekNumber) OVER (PARTITION BY PersonID ORDER BY WeekNumber ASC) ELSE 0 END AS WeeksPassed
    FROM dbo.YourTable)
SELECT PersonID,
       WeekNumber,
       Event,
       CASE WHEN Events = 0 THEN NULL
            ELSE SUM(WeeksPassed) OVER (PARTITION BY PersonID, Events ORDER BY WeekNumber ASC
                                         ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
       END AS WeekSinceLastEvent
FROM Groups;

db<>fiddle

CodePudding user response:

You can do this with a conditional aggregate within a windowed function:

SELECT  t.PersonID,
        t.WeekNumber,
        t.Event,
        WeeksSinceLastEvent = t.WeekNumber - MAX(CASE WHEN t.Event = 1 THEN t.WeekNumber END) 
                                             OVER(PARTITION BY t.PersonID ORDER BY t.WeekNumber)
FROM    dbo.T AS t;

The key parts are:

  1. CASE WHEN t.Event = 1 THEN t.WeekNumber END Only consider week number where it is a valid event. Since MAX with ignore nulls this will only consider relevant rows
  2. OVER (PARTITION BY t.PersonID ORDER BY t.WeekNumber) - Only consider rows for the current person, where the weeknumber is lower than the current row.

Example on DB<>Fiddle

  • Related