Home > Back-end >  Classifying rows into a grouping column that shows the data is related to prior rows
Classifying rows into a grouping column that shows the data is related to prior rows

Time:08-18

I have a set of data that I want to classify into groups based on a prior record id existing on the newer rows. The initial record of the group has a prior sequence id = 0.

The data is as follows:

customer id sequence id prior_sequence id
1 1 0
1 2 1
1 3 2
2 4 0
2 5 4
2 6 0
2 7 6

Ideally, I would like to create the following grouping column and yield the following results:

customer id sequence id prior sequence id grouping
1 1 0 1
1 2 1 1
1 3 2 1
2 4 0 2
2 5 4 2
2 6 0 3
2 7 6 3

I've attempted to utilize island gap logic utilizing the ROW_NUMBER() function. However, I have been unsuccessful in doing so. I suspect the need here is more along the lines of a recursive CTE, which I am attempting at the moment.

CodePudding user response:

I agree that a recursive CTE will do the job. Something like:

WITH reccte AS
( 
    /*query that determines starting point for recursion
     *
     * In this case we want all records with no prior_sequence_id
     */
    SELECT 
        customer_id, 
        sequence_id, 
        prior_sequence_id, 
        /*establish grouping*/
        ROW_NUMBER() OVER (ORDER BY sequence_id) as grouping
    FROM yourtable
    WHERE prior_sequence_id = 0
    UNION 

    /*join the recursive CTe back to the table and iterate*/
    SELECT
        yourtable.customer_id,
        yourtable.sequence_id,
        yourtable.prior_sequence_id,
        reccte.grouping
    FROM reccte
        INNER JOIN yourtable ON reccte.sequence_id = yourtable.prior_sequence_id
)
SELECT * FROM reccte;

CodePudding user response:

It looks like you could use a simple correlated query, at least given your sample data:

select *, (
  select Sum(Iif(prior_sequence_id = 0, 1, 0)) 
  from t t2 
  where t2.sequence_id <= t.sequence_id
) Grouping
from t;

See Example Fiddle

  • Related