T-SQL Grouping Dynamic Date Ranges

Time:10-04

Using MS SQL Server 2019

I have a set of recurring donation records. Each has a FirstGiftDate and a LastGiftDate associated with it. I need to add a GroupedID to these rows so that I can get the full date range, from the earliest FirstGiftDate to the latest LastGiftDate, as long as there is no break of more than 45 days between the recurring donations.

For example, Bob is a long-time supporter. His card has expired multiple times, and he has always started a new gift within 45 days. All of his gifts need to be given a single GroupedID. June, on the other hand, was donating when her card expired. She didn't give again for 6 months, but since then she has kept giving whenever her card expires. June's first gift should get its own GroupedID, and her second and third gifts should be grouped together. The grouping count should restart with each donor.

My initial attempt was to join the donation table back to itself, aliased as D2. This did give me an indicator of which gifts were within the 45-day mark, but I can't wrap my head around how to then link them. My only other thought was to use LEAD and LAG to analyze each scenario and work out the combinations of LEAD and LAG values needed to catch every case, but that doesn't seem as reliable or scalable as I'd like it to be.

I appreciate any help anyone can give.

My code:

SELECT #Donation.*, D2.*    
FROM #Donation
LEFT JOIN #Donation D2 ON #Donation.RecurringGiftID <> D2.RecurringGiftID
                       AND #Donation.Donor = D2.Donor 
                       AND ABS(DATEDIFF(DAY, #Donation.FirstGiftDate, D2.LastGiftDate)) < 45

Table structure and sample data:

CREATE TABLE #Donation 
(
    RecurringGiftID int, 
    Donor nvarchar(25), 
    FirstGiftDate date, 
    LastGiftDate date
)

INSERT INTO #Donation 
VALUES (1, 'Bob', '2017-02-15', '2018-07-01'),
       (15, 'Bob', '2018-08-05', '2019-04-01'),
       (32, 'Bob', '2019-04-15', '2022-06-15'),
       (54, 'June', '2015-05-01', '2016-05-01'),
       (96, 'June', '2016-12-15', '2018-02-01'),
       (120, 'June', '2018-03-04', '2020-07-01')

Desired output:

RecurringGiftID  Donor  FirstGiftDate  LastGiftDate  GroupedID
1                Bob    2017-02-15     2018-07-01    1
15               Bob    2018-08-05     2019-04-01    1
32               Bob    2019-04-15     2022-06-15    1
54               June   2015-05-01     2016-05-01    1
96               June   2016-12-15     2018-02-01    2
120              June   2018-03-04     2020-07-01    2

Answer:

Use LAG() to detect when the current row's FirstGiftDate is more than 45 days after the previous row's LastGiftDate for the same donor, then take a cumulative sum of those flags to form the required GroupedID:

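select *,
       -- running total of the break flags: each break starts a new group for the donor
       GroupedID = sum(g) over (partition by Donor order by FirstGiftDate)
from
(
    select *,
           -- g = 1 when this gift starts more than 45 days after the donor's previous gift ended;
           -- the '19000101' default makes each donor's first gift count as a break, so groups start at 1
           g = case when datediff(day,
                                  lag(LastGiftDate, 1, '19000101') over (partition by Donor
                                                                      order by FirstGiftDate),
                                  FirstGiftDate)
                     > 45
                then 1
                else 0
                end
    from   #Donation
) d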
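
The question's stated goal is the full date range per group, so the query above can be taken one step further. The following is a minimal sketch, assuming the #Donation temp table from the question exists in the current session; the CTE names (Flagged, Grouped) and the output column names (RangeStart, RangeEnd, GiftCount) are made up for illustration and are not part of the original post.

```sql
-- Sketch only: assumes #Donation from the question exists in this session.
-- If run in the same batch as other statements, terminate the previous one with a semicolon.
WITH Flagged AS
(
    SELECT *,
           -- 1 when this gift starts more than 45 days after the donor's previous gift ended
           g = CASE WHEN DATEDIFF(DAY,
                                  LAG(LastGiftDate, 1, '19000101')
                                      OVER (PARTITION BY Donor ORDER BY FirstGiftDate),
                                  FirstGiftDate) > 45
                    THEN 1
                    ELSE 0
               END
    FROM   #Donation
),
Grouped AS
(
    SELECT *,
           -- cumulative sum of break flags, restarting for each donor
           GroupedID = SUM(g) OVER (PARTITION BY Donor ORDER BY FirstGiftDate)
    FROM   Flagged
)
SELECT Donor,
       GroupedID,
       RangeStart = MIN(FirstGiftDate),  -- earliest first gift in the group
       RangeEnd   = MAX(LastGiftDate),   -- latest last gift in the group
       GiftCount  = COUNT(*)             -- number of recurring gifts collapsed into the group
FROM   Grouped
GROUP BY Donor, GroupedID
ORDER BY Donor, GroupedID;
```

With the sample data this should return one row for Bob (2017-02-15 to 2022-06-15, 3 gifts) and two rows for June (2015-05-01 to 2016-05-01 with 1 gift, then 2016-12-15 to 2020-07-01 with 2 gifts). The gaps that drive those groups are 35 and 14 days for Bob, and 228 and 31 days for June.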