Classifying rows into a grouping column that shows the data is related to prior rows-CodePudding

I have a set of data that I want to classify into groups based on a prior record id existing on the newer rows. The initial record of the group has a prior sequence id = 0.

The data is as follows:

customer id	sequence id	prior_sequence id
1	1	0
1	2	1
1	3	2
2	4	0
2	5	4
2	6	0
2	7	6

Ideally, I would like to create the following grouping column and yield the following results:

customer id	sequence id	prior sequence id	grouping
1	1	0	1
1	2	1	1
1	3	2	1
2	4	0	2
2	5	4	2
2	6	0	3
2	7	6	3

I've attempted to utilize island gap logic utilizing the ROW_NUMBER() function. However, I have been unsuccessful in doing so. I suspect the need here is more along the lines of a recursive CTE, which I am attempting at the moment.

CodePudding user response：

I agree that a recursive CTE will do the job. Something like:

WITH reccte AS
( 
    /*query that determines starting point for recursion
     *
     * In this case we want all records with no prior_sequence_id
     */
    SELECT 
        customer_id, 
        sequence_id, 
        prior_sequence_id, 
        /*establish grouping*/
        ROW_NUMBER() OVER (ORDER BY sequence_id) as grouping
    FROM yourtable
    WHERE prior_sequence_id = 0
    UNION 

    /*join the recursive CTe back to the table and iterate*/
    SELECT
        yourtable.customer_id,
        yourtable.sequence_id,
        yourtable.prior_sequence_id,
        reccte.grouping
    FROM reccte
        INNER JOIN yourtable ON reccte.sequence_id = yourtable.prior_sequence_id
)
SELECT * FROM reccte;

CodePudding user response：

It looks like you could use a simple correlated query, at least given your sample data:

select *, (
  select Sum(Iif(prior_sequence_id = 0, 1, 0)) 
  from t t2 
  where t2.sequence_id <= t.sequence_id
) Grouping
from t;

See Example Fiddle