Home > Net >  Get latest balance by day per user via SQL (Bigquery)
Get latest balance by day per user via SQL (Bigquery)

Time:08-31

I have a sample table below:

 --------------------- --------- -------- 
| timestamp           | id      | balance|
|--------------------- --------- --------|
| 2022-08-01 00:00:00 | 1       | 0.01   |
| 2022-08-01 00:00:00 | 2       | 0      | 
| 2022-08-02 00:00:00 | 2       | 0.005  |
| 2022-08-02 07:00:00 | 2       | 0.5    |
| 2022-08-02 07:00:00 | 1       | 0.15   |
| 2022-08-03 00:00:00 | 1       | 0.02   |
| 2021-08-03 01:00:00 | 1       | 0.03   |
 --------------------- --------- -------- 

And I want the following output:

 --------------------- --------- -------- 
| timestamp           | id      | balance|
|--------------------- --------- --------|
| 2022-08-01 00:00:00 | 1       | 0.01   |
| 2022-08-01 00:00:00 | 2       | 0      | 
| 2022-08-02 07:00:00 | 2       | 0.5    |
| 2022-08-02 07:00:00 | 1       | 0.15   |
| 2021-08-03 01:00:00 | 1       | 0.03   |
 --------------------- --------- -------- 

I want to get the latest balance for each id, per day. Ideally the sql would use a window function as the source data table is very large (~900GB) and I need something efficient that will scale.

I've spent a lot of time trying to figure this out and keep getting stuck, any help would be greatly appreciated! Thanks in advance

CodePudding user response:

Consider below simple approach

select * from your_table
qualify 1 = row_number() over id_day
window id_day as (partition by id, date(timestamp) order by timestamp desc)
  • Related