Home > Software design >  sql how to assign the same ID for the same group
sql how to assign the same ID for the same group

Time:12-21

I have a dataset as this:

ID         SESSION         DATE                              
1             A           2021/1/1       
1             A           2021/1/2      
1             B           2021/1/3       
1.            B           2021/1/4      
1             A           2021/1/5       
1             A           2021/1/6      

So what I want to create is the GROUP column which assigns the same row number for where ID column AND SESSION column is the same as below:

    ID         SESSION         DATE         GROUP                       
     1             A           2021/1/1        1
     1             A           2021/1/2        1
     1             B           2021/1/3        2
     1             B           2021/1/4        2
     1             A           2021/1/5        3
     1             A           2021/1/6        3

Does anyone know how to do this in SQL in an efficient way because I have about 5 billion rows? Thank you in advance!

CodePudding user response:

You have a kind of gaps and islands problem, you can create your groupings by counting when the session changes using lag, like so:

select Id, Session, Date,  
  Sum(case when session = prevSession then 0 else 1 end) over(partition by Id order by date) "Group"
from (
  select *, 
    Lag(Session) over(partition by Id order by date) prevSession
  from t
)t;

Example Fiddle using MySql but this is ansi SQL that should work in most DBMS.

  •  Tags:  
  • sql
  • Related