Home > Enterprise >  Get minimum per row based on value from another column
Get minimum per row based on value from another column

Time:01-31

Below is the base table of data..

date customer_id score1 score2
01/01/22 a 1 1
02/01/22 a 1 1
01/01/22 b 2 2
02/01/22 b 4 1
01/01/22 c 1 1
02/01/22 c 1 4
01/01/22 d 5 1
02/01/22 d 10 1

This is the result that I want to achieve, where I only pull through the rows where there has been a change in either score1 or score2 from the previous date. In this case, b has gone from 2 to 4 on score1 and 2 to 1 on score2, c has gone from 1 to 4 on score 2, d has gone from 5 to 10 on score 1.

date customer_id score1 score2
02/01/22 b 4 1
02/01/22 c 1 4
02/01/22 d 10 1

Unsure if there is a function to do this. Altenatively, would it be best to have two separate tables initially and use a join to achieve this. Using SQL presto if that helps.

Many thanks!

CodePudding user response:

We can use the LAG() window function here:

WITH cte AS (
    SELECT t.*, LAG(score, 1, score1) OVER (PARTITION BY customer_id
                                            ORDER BY date) AS lag_score_1,
                LAG(score, 1, score2) OVER (PARTITION BY customer_id
                                            ORDER BY date) AS lag_score_2
    FROM yourTable t
)

SELECT date, customer_id, score
FROM cte
WHERE score1 <> lag_score_1 OR score2 <> lag_score_2
ORDER BY date;

This answer uses the 3 parameter version of LAG(). The second parameter specifies a step of 1, while the third parameter specifies a default value in case there is no previous value. In this case, we use score as the default such that the earliest record in each customer partition is ignored.

  • Related