Home > Blockchain >  Grouping by Date inclusivity
Grouping by Date inclusivity

Time:08-24

Here is the data I'm working with here

Accountid Month
123 08/01/2021
123 09/01/2021
123 03/01/2022
123 04/01/2022
123 05/01/2022
123 06/01/2022

I'm trying to insert into a new table where the data is like this

Accountid Start Month End Month
123 08/01/2021 09/01/2021
123 03/01/2022 06/01/2022

I'm not sure how to separate them with the gap, and group by the account id in this case.

Thanks in advance

CodePudding user response:

That's a gaps and islands problem; one option to do it is:

Sample data:

SQL> with test (accountid, month) as
  2    (select 123, date '2021-01-08' from dual union all
  3     select 123, date '2021-01-09' from dual union all
  4     select 123, date '2021-01-03' from dual union all
  5     select 123, date '2021-01-04' from dual union all
  6     select 123, date '2021-01-05' from dual union all
  7     select 123, date '2021-01-06' from dual
  8    ),

Query begins here:

  9  temp as
 10    (select accountid, month,
 11       to_char(month, 'J') - row_number() Over
 12         (partition by accountid order by month) diff
 13     from test
 14    )
 15  select accountid,
 16    min(month) as start_month,
 17    max(month) as end_Month
 18  from temp
 19  group by accountid, diff
 20  order by accountid, start_month;

 ACCOUNTID START_MONT END_MONTH
---------- ---------- ----------
       123 03/01/2021 06/01/2021
       123 08/01/2021 09/01/2021

SQL>

Although related to MS SQL Server, have a look at Introduction to Gaps and Islands Analysis; should be interesting reading for you, I presume.

CodePudding user response:

In 12c you may also use match_recognize for gaps-and-islands problems to define grouping rules (islands) in a more readable and natural way.

select *
from input_
match_recognize(
  partition by accountid
  order by month asc
  measures
    first(month) as start_month,
    last(month) as end_month
  /*Any month followed by any number of subsequent month */
  pattern(any_ next*)
  define
    /*Next is the month right after the previous one*/
    next as months_between(month, prev(month)) = 1
)
ACCOUNTID START_MONTH END_MONTH
123 2021-08-01 2021-09-01
123 2022-03-01 2022-06-01

db<>fiddle here

  • Related