This is a follow-up question to my initial post
Example Situation: An order system tracks manually entered due dates by recording a system log date that is always unique (this would be a datetime, but I've used dates for simplicity, making each unique).
I would like to assign a section number to each due date grouping where the due date remains the same chronologically.
Stu's response solved the table in my initial post, but I've noticed that if I replace the 4/15/2022 due date associated with SysLogDate of 1/16/2022 to be 4/13/2022, the desired ordering does not seem to be maintained:
Note: 4/13/2022 date is an arbitrary change. The same issue occurs if I use any other unique date that is not yet already in the DueDate column. Ultimately, I also need to be able to handle changes to/from NULL, where someone 'forgets' to enter the date, but replacing the date with NULL also yields the same issue.
Updated Table:
CREATE TABLE #DueDates (OrderNo INT, DueDate Date, SysLogDate Date)
INSERT INTO #DueDates Values (1, '4/10/2022', '1/10/2022')
,(1, '4/10/2022', '1/11/2022')
,(1, '4/15/2022', '1/15/2022')
,(1, '4/13/2022', '1/16/2022') -- Due Date Altered since prior post
,(1, '4/15/2022', '1/17/2022')
,(1, '4/10/2022', '1/18/2022')
,(1, '4/10/2022', '1/19/2022')
,(1, '4/10/2022', '1/20/2022')
,(2, '4/10/2022', '2/16/2022')
,(2, '4/10/2022', '2/17/2022')
,(2, '4/15/2022', '2/18/2022')
,(2, '4/15/2022', '2/20/2022')
,(2, '4/15/2022', '2/21/2022')
,(2, '4/10/2022', '2/22/2022')
,(2, '4/10/2022', '2/24/2022')
,(2, '4/10/2022', '2/26/2022')
Desired Results Are:
OrderNo DueDate SysLogDate SectionNumber_WithinDueDate
1 2022-04-10 2022-01-10 1
1 2022-04-10 2022-01-11 1
1 2022-04-15 2022-01-15 2
1 2022-04-13 2022-01-16 3
1 2022-04-15 2022-01-17 4
1 2022-04-10 2022-01-18 5
1 2022-04-10 2022-01-19 5
1 2022-04-10 2022-01-20 5
2 2022-04-10 2022-02-16 1
2 2022-04-10 2022-02-17 1
2 2022-04-15 2022-02-18 2
2 2022-04-15 2022-02-20 2
2 2022-04-15 2022-02-21 2
2 2022-04-10 2022-02-22 3
2 2022-04-10 2022-02-24 3
2 2022-04-10 2022-02-26 3
...but applying the solution from my prior post to this updated table yields:
OrderNo DueDate SysLogDate SectionNumber_WithinDueDate
1 2022-04-10 2022-01-10 1
1 2022-04-10 2022-01-11 1
1 2022-04-15 2022-01-15 2
1 2022-04-13 2022-01-16 3 **
1 2022-04-15 2022-01-17 3 **
1 2022-04-10 2022-01-18 3 **
1 2022-04-10 2022-01-19 3 **
1 2022-04-10 2022-01-20 3 **
2 2022-04-10 2022-02-16 1
2 2022-04-10 2022-02-17 1
2 2022-04-15 2022-02-18 2
2 2022-04-15 2022-02-20 2
2 2022-04-15 2022-02-21 2
2 2022-04-10 2022-02-22 3
2 2022-04-10 2022-02-24 3
2 2022-04-10 2022-02-26 3
Here's a demo to work that uses the above updated table and the solution from my prior post, and shows the above non-desired results: Fiddle
Demo showing same effect when the date is replaced with NULL: Fiddle with NULL
Copy of the selected solution from my prior post (used in the above Fiddles):
select OrderNo, DueDate, SysLogDate,
dense_rank() over(partition by orderno order by gp) SectionNumber_WithinDueDate
from (
select *,
Row_Number() over(partition by OrderNo order by SysLogDate)
- Row_Number() over(partition by OrderNo, DueDate order by SysLogDate) gp
from #DueDates
)t
order by OrderNo, SysLogDate;
It's a small change in the data, but I haven't been able to work out how to alter the 'Row_Number difference line' in the subquery to get the desired results.
Thank you for any advice you can offer here :)
CodePudding user response:
For gap and island problem, I prefer to use lag() window function as it is easier to understand.
Use lag()
to compare previous row value and when changed, set a flag (value 1). Perform a cumulative sum on the flag and you get the grp
. Use dense_rank()
on the grp
and it gives you your SectionNumber_WithinDueDate
As you have NULL
value, use ISNULL()
to return a date value (99991231
) for comparison
select OrderNo, DueDate, SysLogDate,
SectionNumber_WithinDueDate = dense_rank() over (partition by OrderNo
order by grp)
from
(
select *, grp = sum(g) over (partition by OrderNo
order by SysLogDate)
from
(
select *,
g = case when isnull(DueDate, '99991231')
<> isnull(lag(DueDate) over (partition by OrderNo
order by SysLogDate), '99991231')
then 1
else 0
end
from #DueDates
) d
) d
order by OrderNo, SysLogDate;
Fiddle on your sample data :