Home > Mobile >  Create groups based on 14 day interval
Create groups based on 14 day interval

Time:04-01

I know this is a common question, but I couldn't find something that matches my case. I have this data:

 --------- --------- 
| user_id | view_dt |
 --------- --------- 
| A       |     1/1 |
 --------- --------- 
| A       |    1/10 |
 --------- --------- 
| A       |    1/14 |
 --------- --------- 
| A       |    1/22 |
 --------- --------- 
| A       |    1/23 |
 --------- --------- 
| A       |    1/30 |
 --------- --------- 

I want this data to be grouped based on a 14 day interval. That is, the groups would be:

GROUP 1: 1/1, 1/10, 1/14

GROUP 2: 1/22, 1/23, 1/30

Note that I the 1/30 date should belong to GROUP 2 since 1/30 should be compared to the first date of GROUP 2 (1/22) instead of (1/1).

The problem I am having is my own query shows 1/30 belonging to group 3.

CREATE TABLE T (
    user_id VARCHAR(20),
    view_dt DATETIME
);

INSERT INTO t VALUES ('A', '2022-01-01');
INSERT INTO t VALUES ('A', '2022-01-10');
INSERT INTO t VALUES ('A', '2022-01-14');
INSERT INTO t VALUES ('A', '2022-01-22');
INSERT INTO t VALUES ('A', '2022-01-23');
INSERT INTO t VALUES ('A', '2022-01-30');

SELECT user_id,
    view_dt,
    DENSE_RANK() OVER(ORDER BY gr) grp
FROM (
SELECT 
    user_id,
    view_dt,
    CAST (view_dt - MIN (view_dt) OVER (PARTITION BY user_id ORDER BY view_dt) AS INT )/14   1 AS gr
    FROM T
) x
ORDER BY user_id

Ideal Output

 --------- --------- ------- 
| user_id | view_dt | group |
 --------- --------- ------- 
| A       |     1/1 |     1 |
 --------- --------- ------- 
| A       |    1/10 |     1 |
 --------- --------- ------- 
| A       |    1/14 |     1 |
 --------- --------- ------- 
| A       |    1/22 |     2 |
 --------- --------- ------- 
| A       |    1/23 |     2 |
 --------- --------- ------- 
| A       |    1/30 | 2     |
 --------- --------- ------- 

Output from my previous query:

 --------- --------- ------- 
| user_id | view_dt | group |
 --------- --------- ------- 
| A       |     1/1 |     1 |
 --------- --------- ------- 
| A       |    1/10 |     1 |
 --------- --------- ------- 
| A       |    1/14 |     1 |
 --------- --------- ------- 
| A       |    1/22 |     2 |
 --------- --------- ------- 
| A       |    1/23 |     2 |
 --------- --------- ------- 
| A       |    1/30 | 3**   |
 --------- --------- ------- 

CodePudding user response:

One option is to use Recursive CTE

-- Recursive CTE solution
with cte as
(
    -- CTE for adding a row_number
    select rn = row_number() over (partition by user_id order by view_dt),
           user_id, view_dt
    from   T
),
rcte as
(
    -- RCTE - anchor member
    -- first_dt is the first date of the group
    select rn, user_id, view_dt, grp = 1, first_dt = view_dt
    from   cte
    where  rn = 1
    
    union all
    
    -- RCTE - recursive member
    -- if date is more than 14 days from first_dt, grp   1, update first_dt
    select c.rn, c.user_id, c.view_dt,
           grp = case when datediff(day, r.first_dt, c.view_dt) > 14
                      then r.grp   1
                      else r.grp
                      end,
           first_dt = case when datediff(day, r.first_dt, c.view_dt) > 14
                      then c.view_dt
                      else r.first_dt
                      end
    from   cte c
           inner join rcte r on  c.user_id = r.user_id
                             and c.rn      = r.rn   1
)
select *
from   rcte

Note : Please do avoid using arithmetic operator on date. view_dt - MIN (view_dt). Should use datediff() See Bad Habits to Kick : Using shorthand with date/time operations

  • Related