Home > Software engineering >  Back-filling time-series data with previous time's values
Back-filling time-series data with previous time's values

Time:11-17

I have a table of time-series data that has some gaps in the series. An example of the data is below:

Date Value
2022-11-17 1
2022-11-14 2

I want to insert rows for the dates between the existing rows (2022-11-15, 2022-11-16) that have the value of the latest date before the date being inserted (the 2022-11-14 row).

I started by using an imperative solution in my application programming language but I'm convinced there must be a way to do this in SQL.

CodePudding user response:

demo:db<>fiddle

INSERT INTO mytable                               -- 5

SELECT
    generate_series(                              -- 1
        mydate   1,                               -- 2
        lead(mydate) OVER (ORDER BY mydate) - 1,  -- 3
        interval '1 day'                          
    )::date as gs,
    t.myvalue                                     -- 4
FROM mytable t;
  1. Use generate_series() to generate date series
  2. Start of your date series is the next day of the row's date value
  3. End of your date series is the day before the next row's date value. Here the lead() window function is used to access the next row
  4. Use the generated dates from the function and the value of the actual row for the newly generated rows.
  5. Finally insert them into your table.
  • Related