Home > front end >  Grouping based on start date matching the previous row's end date SQL
Grouping based on start date matching the previous row's end date SQL

Time:02-18

Hoping someone can help me out with this problem.

I have the following sample dataset:

MEM_ID CLM_ID ADM_DT DCHG_DT
1 111 01-01-2020 02-01-2020
1 112 03-01-2020 04-01-2020
1 113 04-01-2020 05-01-2020
1 114 06-01-2020 07-01-2020
2 211 01-01-2020 02-01-2020
2 212 05-01-2020 08-01-2020
3 311 02-01-2020 03-01-2020
3 312 03-01-2020 05-01-2020
3 313 05-01-2020 06-01-2020
3 314 07-01-2020 08-01-2020

I am trying to create groupings based on MEM_ID. If a ADM_DT is equal to the previous DCHG_DT then the records should be grouped together

Below is the expected output:

MEM_ID CLM_ID ADM_DT DCHG_DT GROUP_ID
1 111 01-01-2020 02-01-2020 1
1 112 03-01-2020 04-01-2020 2
1 113 04-01-2020 05-01-2020 2
1 114 06-01-2020 07-01-2020 3
2 211 01-01-2020 02-01-2020 1
2 212 05-01-2020 08-01-2020 2
3 311 02-01-2020 03-01-2020 1
3 312 03-01-2020 05-01-2020 1
3 313 05-01-2020 06-01-2020 1
3 314 07-01-2020 08-01-2020 2

I have attempted the following:

select DISTINCT MEM_ID
,CLM_ID
,ADM_DT
,DCHG_DT
,CASE WHEN ADM_DT = LAG(DCHG_DT) OVER(PARTITION BY MEM_ID ORDER BY ADM_DT, DCHG_DT) THEN 0 ELSE 1 END AS ISSTART
FROM 
table

Which produces something like this:

MEM_ID CLM_ID ADM_DT DCHG_DT ISSTART
1 111 01-01-2020 02-01-2020 1
1 112 03-01-2020 04-01-2020 1
1 113 04-01-2020 05-01-2020 0
1 114 06-01-2020 07-01-2020 1
2 211 01-01-2020 02-01-2020 1
2 212 05-01-2020 08-01-2020 1
3 311 02-01-2020 03-01-2020 1
3 312 03-01-2020 05-01-2020 0
3 313 05-01-2020 06-01-2020 0
3 314 07-01-2020 08-01-2020 1

I have also looked into other external sources such as https://www.kodyaz.com/t-sql/sql-query-for-overlapping-time-periods-on-sql-server.aspx

This got me pretty close but I realized that the author was using a recursive CTE and Netezza does not support that function.

Ultimately I would like to create these groupings so that i can then merge to the original table that I am using and sum values based on the assigned group for each MEM_ID.

Thank you in advance for any help provided.

CodePudding user response:

Try this:

select MEM_ID, CLM_ID, ADM_DT, DCHG_DT,
sum(ISSTART) over(partition by MEM_ID order by ADM_DT, DCHG_DT rows unbounded preceding) as GROUP_ID from
(select MEM_ID
,CLM_ID
,ADM_DT
,DCHG_DT
,CASE WHEN ADM_DT = LAG(DCHG_DT) OVER(PARTITION BY MEM_ID ORDER BY ADM_DT, DCHG_DT) THEN 0 ELSE 1 END AS ISSTART
FROM 
table_name) t

Fiddle

Basically using your ISSTART in a sum to get the desired output.

  • Related