Get minimum per row based on value from another column-CodePudding

Below is the base table of data..

date	customer_id	score1	score2
01/01/22	a	1	1
02/01/22	a	1	1
01/01/22	b	2	2
02/01/22	b	4	1
01/01/22	c	1	1
02/01/22	c	1	4
01/01/22	d	5	1
02/01/22	d	10	1

This is the result that I want to achieve, where I only pull through the rows where there has been a change in either score1 or score2 from the previous date. In this case, b has gone from 2 to 4 on score1 and 2 to 1 on score2, c has gone from 1 to 4 on score 2, d has gone from 5 to 10 on score 1.

date	customer_id	score1	score2
02/01/22	b	4	1
02/01/22	c	1	4
02/01/22	d	10	1

Unsure if there is a function to do this. Altenatively, would it be best to have two separate tables initially and use a join to achieve this. Using SQL presto if that helps.

Many thanks!

CodePudding user response：

We can use the LAG() window function here:

WITH cte AS (
    SELECT t.*, LAG(score, 1, score1) OVER (PARTITION BY customer_id
                                            ORDER BY date) AS lag_score_1,
                LAG(score, 1, score2) OVER (PARTITION BY customer_id
                                            ORDER BY date) AS lag_score_2
    FROM yourTable t
)

SELECT date, customer_id, score
FROM cte
WHERE score1 <> lag_score_1 OR score2 <> lag_score_2
ORDER BY date;

This answer uses the 3 parameter version of LAG(). The second parameter specifies a step of 1, while the third parameter specifies a default value in case there is no previous value. In this case, we use score as the default such that the earliest record in each customer partition is ignored.