I have the following table on SQL Server:
ID | FROM | TO | OFFER NUMBER |
---|---|---|---|
1 | 2022.01.02 | 9999.12.31 | 1 |
1 | 2022.01.02 | 2022.02.10 | 2 |
2 | 2022.01.05 | 2022.02.15 | 1 |
3 | 2022.01.02 | 9999.12.31 | 1 |
3 | 2022.01.15 | 2022.02.20 | 2 |
3 | 2022.02.03 | 2022.02.25 | 3 |
4 | 2022.01.16 | 2022.02.05 | 1 |
5 | 2022.01.17 | 2022.02.13 | 1 |
5 | 2022.02.05 | 2022.02.13 | 2 |
The range includes the start date but excludes the end date. The date 9999.12.31 is given (comes from another system), but we could use the last day of the current quarter instead. I need to find a way to determine the number of days when the customer sees exactly one, two, or three offers. The following picture shows the method upon id 3:
The expected results should be like (without using the last day of the quarter):
ID | # of days when the customer sees only 1 offer | # of days when the customer sees 2 offers | # of days when the customer sees 3 offers |
---|---|---|---|
1 | 2913863 | 39 | 0 |
2 | 41 | 0 | 0 |
3 | 2913861 | 24 | 17 |
4 | 20 | 0 | 0 |
5 | 19 | 8 | 0 |
I've found this article but it did not enlighten me. Also I have limited privileges that is I am not able to declare a variable for example so I need to use "basic" TSQL.
Please provide a detailed explanation besides the code.
Thanks in advance!
CodePudding user response:
Find all date points for each ID. For each date point, find the number of overlapping.
Refer to comments within query
with
dates as
(
-- get all date points
select ID, theDate = FromDate from offers
union -- union to exclude any duplicate
select ID, theDate = ToDate from offers
),
cte as
(
select ID = d.ID,
Date_Start = d.theDate,
Date_End = LEAD(d.theDate) OVER (PARTITION BY ID ORDER BY theDate),
TheCount = c.cnt
from dates d
cross apply
(
-- Count no of overlapping
select cnt = count(*)
from offers x
where x.ID = d.ID
and x.FromDate <= d.theDate
and x.ToDate > d.theDate
) c
)
select ID, TheCount, days = sum(datediff(day, Date_Start, Date_End))
from cte
where Date_End is not null
group by ID, TheCount
order by ID, TheCount
Result :
ID | TheCount | days |
---|---|---|
1 | 1 | 2913863 |
1 | 2 | 39 |
2 | 1 | 41 |
3 | 1 | 2913861 |
3 | 2 | 29 |
3 | 3 | 12 |
4 | 1 | 20 |
5 | 1 | 19 |
5 | 2 | 8 |
To get to the required format, use PIVOT
CodePudding user response:
The following will (for each ID) extract all distinct dates, construct non-overlapping date ranges to test, and will count up the number of offers per range. The final step is to sum and format.
The fact that the start dates are inclusive and the end dates are exclusive while sometimes non-intuitive for the human, actually works well in algorithms like this.
DECLARE @Data TABLE (Id INT, FromDate DATETIME, ToDate DATETIME, OfferNumber INT)
INSERT @Data
VALUES
(1, '2022-01-02', '9999-12-31', 1),
(1, '2022-01-02', '2022-02-10', 2),
(2, '2022-01-05', '2022-02-15', 1),
(3, '2022-01-02', '9999-12-31', 1),
(3, '2022-01-15', '2022-02-20', 2),
(3, '2022-02-03', '2022-02-25', 3),
(4, '2022-01-16', '2022-02-05', 1),
(5, '2022-01-17', '2022-02-13', 1),
(5, '2022-02-05', '2022-02-13', 2)
;
WITH Dates AS ( -- Gather distinct dates
SELECT Id, Date = FromDate FROM @Data
UNION --(distinct)
SELECT Id, Date = ToDate FROM @Data
),
Ranges AS ( --Construct non-overlapping ranges (The ToDate = NULL case will be ignored later)
SELECT ID, FromDate = Date, ToDate = LEAD(Date) OVER(PARTITION BY Id ORDER BY Date)
FROM Dates
),
Counts AS ( -- Calculate days and count offers per date range
SELECT R.Id, R.FromDate, R.ToDate,
Days = DATEDIFF(DAY, R.FromDate, R.ToDate),
Offers = COUNT(*)
FROM Ranges R
JOIN @Data D ON D.Id = R.Id
AND D.FromDate <= R.FromDate
AND D.ToDate >= R.ToDate
GROUP BY R.Id, R.FromDate, R.ToDate
)
SELECT Id
,[Days with 1 Offer] = SUM(CASE WHEN Offers = 1 THEN Days ELSE 0 END)
,[Days with 2 Offers] = SUM(CASE WHEN Offers = 2 THEN Days ELSE 0 END)
,[Days with 3 Offers] = SUM(CASE WHEN Offers = 3 THEN Days ELSE 0 END)
FROM Counts
GROUP BY Id
The WITH
clause introduces Common Table Expressions (CTEs) which progressively build up intermediate results until a final select can be made.
Results:
Id | Days with 1 Offer | Days with 2 Offers | Days with 3 Offers |
---|---|---|---|
1 | 2913863 | 39 | 0 |
2 | 41 | 0 | 0 |
3 | 2913861 | 24 | 17 |
4 | 20 | 0 | 0 |
5 | 19 | 8 | 0 |
Alternately, the final select could use a pivot. Something like:
SELECT Id,
[Days with 1 Offer] = ISNULL([1], 0),
[Days with 2 Offers] = ISNULL([2], 0),
[Days with 3 Offers] = ISNULL([3], 0)
FROM (SELECT Id, Offers, Days FROM Counts) C
PIVOT (SUM(Days) FOR Offers IN ([1], [2], [3])) PVT
ORDER BY Id
See This db<>fiddle for a working example.