Home > Net >  Find longest increasing sequence in groups of consecutive dates using SQL
Find longest increasing sequence in groups of consecutive dates using SQL

Time:08-10

Using MARIA-DB 10.6

I have a table user containing columns dates and solved for each row. Dates will always be unique (only 1 row of any date, if present). The solved count can be same, or higher, as the dates roll by.

Here's an example table user:

date solved
2010-11-26 2
2010-11-27 4
2010-11-28 6
2010-11-29 10
2010-12-05 11
2010-12-06 11
2010-12-07 11
2010-12-08 12
2010-12-09 12
2010-12-10 12

What I need to find from this is: the longest sequence of consecutive dates, where the solved count is increasing.

Now the way I approached it is that there's 2 parts to this. First part is finding the groups of consecutive days. Following this article, I have achieved that with this query:

WITH consecutiveDateGroups AS (
  SELECT
   ROW_NUMBER() OVER (ORDER BY date) AS rowNum,
   DATE_ADD(date, INTERVAL (-ROW_NUMBER() OVER (ORDER BY date)) DAY) AS consecutiveDateGroup,
   date,
   solved
  FROM user
)

SELECT *
FROM consecutiveDateGroups
ORDER BY rowNum

In my case, we have 3 groups of consecutive dates:

[group 1] -> 2010-11-26 to 2010-11-29 (4 days)

[group 2] -> 2010-12-05 to 2010-12-10 (6 days)

I can group by the number of consecutive days in each group with this query:

WITH 
            
consecutiveDateGroups AS (
  SELECT
   ROW_NUMBER() OVER (ORDER BY date) AS rowNum,
   DATE_ADD(date, INTERVAL (-ROW_NUMBER() OVER (ORDER BY date)) DAY) AS consecutiveDateGroup,
   date
  FROM user
)
            
SELECT
  COUNT(*) AS consecutiveDates,
  MIN(date) AS minDate,
  MAX(date) AS maxDate
FROM consecutiveDateGroups
GROUP BY consecutiveDateGroup

The result I have so far is:

consecutiveDates minDate maxDate
4 2010-11-26 2010-11-29
6 2010-12-05 2010-12-10

I am all good so far. But after this is where things get muddy for me, and I am stuck. How do I return the number of increasing solved columns for each consecutive date group?

For example, in this case:

[group 1] has an increasing sequence of 4 [2,4,6,10]

[group 2] has an increasing sequence of 2 [11,12]

So the result I want is:

consecutiveDates minDate maxDate increasingSolvedSequence
4 2010-11-26 2010-11-29 4
6 2010-12-05 2010-12-10 2

Any help would be appreciated!

Fiddle: https://dbfiddle.uk/?rdbms=mariadb_10.6&fiddle=c80ffb091bb272b0c3b2ae8abbbbbaf1

CodePudding user response:

Try the following:

set @cn = 1;
WITH consecutiveDateGroups AS 
(
  SELECT
   ROW_NUMBER() OVER (ORDER BY date) AS rowNum,
   DATE_ADD(date, INTERVAL (-ROW_NUMBER() OVER (ORDER BY date)) DAY) AS consecutiveDateGroup,
   date,
   solved
  FROM user
)
, 
cte AS
(
 SELECT *,
 COALESCE(LEAD(solved) OVER (PARTITION BY consecutiveDateGroup ORDER BY date), solved) AS LD
 FROM consecutiveDateGroups
 ORDER BY rowNum
),
cte2 as 
(
 SELECT *,
 CASE WHEN solved <> ld THEN @cn:=@cn 1 ELSE @cn:=1 END AS ch
 FROM cte
)
SELECT
  COUNT(*) AS consecutiveDates,
  MIN(date) AS minDate,
  MAX(date) AS maxDate,
  MAX(ch) AS increasingSolvedSequence  
FROM cte2
GROUP BY consecutiveDateGroup

See a demo from db<>fiddle.

  • Related