Home > database >  Group by consecutive dates in TSQL
Group by consecutive dates in TSQL

Time:10-20

I am having trouble with the following problem. Using t-sql, starting from the following sample data in the table:

valid_from valid_to article price
2021-01-01 2021-01-02 AAA 10,99
2021-01-02 2021-01-03 AAA 10,99
2021-01-03 2021-01-05 AAA 10,99
2021-01-05 2021-01-06 AAA 12,99
2021-01-06 2021-01-09 AAA 12,99
2021-01-09 2021-01-14 AAA 10,99
2021-01-14 2021-01-15 AAA 10,99
2021-01-15 2021-01-16 AAA 10,99

I have to produce the "summarized" data table collapsing the data of the consecutive days when the price remained the same and producing just one line composed by the smallest valid_from and greatest valid_to. In this case the desired output would be:

valid_from valid_to article price
2021-01-01 2021-01-05 AAA 10,99
2021-01-05 2021-01-09 AAA 12,99
2021-01-09 2021-01-16 AAA 10,99

I tried with some queries, but I can't find a way to solve completely the problem.

DECLARE @TBL_DATA AS TABLE (
    valid_from DATE NOT NULL,
    valid_to DATE NOT NULL,
    article VARCHAR(20) NOT NULL,
    price FLOAT NOT NULL
)

INSERT INTO @TBL_DATA (
    valid_from,
    valid_to,
    article,
    price
)
VALUES
('2021-01-01','2021-01-02','AAA',10.99),
('2021-01-02','2021-01-03','AAA',10.99),
('2021-01-03','2021-01-05','AAA',10.99),
('2021-01-05','2021-01-06','AAA',12.99),
('2021-01-06','2021-01-09','AAA',12.99),
('2021-01-09','2021-01-14','AAA',10.99),
('2021-01-14','2021-01-15','AAA',10.99),
('2021-01-15','2021-01-16','AAA',10.99)

SELECT
    td.*,
    DENSE_RANK() OVER (ORDER BY td.article ASC, td.price ASC) AS dr
FROM @TBL_DATA AS td
WHERE 1 = 1
ORDER BY
    td.valid_from ASC
valid_from valid_to article price dr
2021-01-01 2021-01-02 AAA 10.99 1
2021-01-02 2021-01-03 AAA 10.99 1
2021-01-03 2021-01-05 AAA 10.99 1
2021-01-05 2021-01-06 AAA 12.99 2
2021-01-06 2021-01-09 AAA 12.99 2
2021-01-09 2021-01-14 AAA 10.99 1
2021-01-14 2021-01-15 AAA 10.99 1
2021-01-15 2021-01-16 AAA 10.99 1

As intermediate step I calculate the DENSE_RANK ordering by article and price, but the 3rd group (by price) is getting the same rank of the 1st. Can you please help me?

CodePudding user response:

you can use rank or row_number window functions to find islands & gaps:

select min(valid_from) valid_from, max(valid_to) valid_to, article, price
from (
   select * 
        , rank() over (partition by article order by valid_from) 
        - rank() over (partition by article,price order by valid_from) groups
   from TBL_DATA) t
group by groups , price, article
order by min(valid_from)

db<>fiddle here

  • Related