Home > database >  How to join user with data from last 3 sessions in SQL?
How to join user with data from last 3 sessions in SQL?

Time:09-08

I have a table with user purchase information. Every user can buy at any day. Now, I want to create a new column that does a self-join but uses an aggregate of only the last 3 purchases of that user. How would I do this?

user | date       | purchase_amount
1    | 2020-01-01 | 10
1    | 2020-01-04 | 4
1    | 2020-01-05 | 1
1    | 2020-02-01 | 6
2    | ....

Now, I want to add a 4th column which takes the average of the last (up to) 3 purchases PRIOR to the current date.

user | date       | purchase_amount | past_3_purchases_avg
1    | 2020-01-01 | 10              | 0 (no prior purchase)
1    | 2020-01-04 | 4               | 10 (last and only purchase is $10)
1    | 2020-01-05 | 1               | 7  (last 2 purchases are $4 and $10)
1    | 2020-02-01 | 6               | 5  (last 3 purchases $1 and $4 and $10)
1    | 2020-02-04 | 3               | 3.6 (last 3 purchases $1 and $4 and 6$)
2    | ....

With window functions and lag function this takes too long given my dataset with multi-millions of rows. What is a good self join solution?

CodePudding user response:

I know you said windowed functions didn't work for you, but you also said you were using lead and lag, which are not really ideal.

This code should be faster than any self join:

SELECT 
    *, 
    AVG([purchase_amount]) 
    OVER(   PARTITION BY [userid]
            ORDER BY [date] DESC 
            ROWS BETWEEN 3 PRECEEDING 
            AND 1 PRECEEDING
    ) [Last3Avg]
FROM [data]

If that's not faster, I'd have a look at your indexing, cause I don't think a self join is ever going to beat that performance-wise if you've got a userid/date index that includes purchase_amount.

  • Related