Group by consecutive dates in TSQL-CodePudding

I am having trouble with the following problem. Using t-sql, starting from the following sample data in the table:

valid_from	valid_to	article	price
2021-01-01	2021-01-02	AAA	10,99
2021-01-02	2021-01-03	AAA	10,99
2021-01-03	2021-01-05	AAA	10,99
2021-01-05	2021-01-06	AAA	12,99
2021-01-06	2021-01-09	AAA	12,99
2021-01-09	2021-01-14	AAA	10,99
2021-01-14	2021-01-15	AAA	10,99
2021-01-15	2021-01-16	AAA	10,99

I have to produce the "summarized" data table collapsing the data of the consecutive days when the price remained the same and producing just one line composed by the smallest valid_from and greatest valid_to. In this case the desired output would be:

valid_from	valid_to	article	price
2021-01-01	2021-01-05	AAA	10,99
2021-01-05	2021-01-09	AAA	12,99
2021-01-09	2021-01-16	AAA	10,99

I tried with some queries, but I can't find a way to solve completely the problem.

DECLARE @TBL_DATA AS TABLE (
    valid_from DATE NOT NULL,
    valid_to DATE NOT NULL,
    article VARCHAR(20) NOT NULL,
    price FLOAT NOT NULL
)

INSERT INTO @TBL_DATA (
    valid_from,
    valid_to,
    article,
    price
)
VALUES
('2021-01-01','2021-01-02','AAA',10.99),
('2021-01-02','2021-01-03','AAA',10.99),
('2021-01-03','2021-01-05','AAA',10.99),
('2021-01-05','2021-01-06','AAA',12.99),
('2021-01-06','2021-01-09','AAA',12.99),
('2021-01-09','2021-01-14','AAA',10.99),
('2021-01-14','2021-01-15','AAA',10.99),
('2021-01-15','2021-01-16','AAA',10.99)

SELECT
    td.*,
    DENSE_RANK() OVER (ORDER BY td.article ASC, td.price ASC) AS dr
FROM @TBL_DATA AS td
WHERE 1 = 1
ORDER BY
    td.valid_from ASC

valid_from	valid_to	article	price	dr
2021-01-01	2021-01-02	AAA	10.99	1
2021-01-02	2021-01-03	AAA	10.99	1
2021-01-03	2021-01-05	AAA	10.99	1
2021-01-05	2021-01-06	AAA	12.99	2
2021-01-06	2021-01-09	AAA	12.99	2
2021-01-09	2021-01-14	AAA	10.99	1
2021-01-14	2021-01-15	AAA	10.99	1
2021-01-15	2021-01-16	AAA	10.99	1

As intermediate step I calculate the DENSE_RANK ordering by article and price, but the 3rd group (by price) is getting the same rank of the 1st. Can you please help me?

CodePudding user response：

you can use rank or row_number window functions to find islands & gaps:

select min(valid_from) valid_from, max(valid_to) valid_to, article, price
from (
   select * 
        , rank() over (partition by article order by valid_from) 
        - rank() over (partition by article,price order by valid_from) groups
   from TBL_DATA) t
group by groups , price, article
order by min(valid_from)

db<>fiddle here