Home > Net >  Generate rows to fill in gaps between years, carry over a value from previous year
Generate rows to fill in gaps between years, carry over a value from previous year

Time:07-19

I have a table of road condition ratings (roads are rated from 1-20; 20 being good).

with road_inspections 
(road_id, year, cond) as (

select 1, 2009,   17 from dual union all
select 1, 2011,   16 from dual union all
select 1, 2015,   14 from dual union all
select 1, 2016, 18.3 from dual union all
select 1, 2019, 18.1 from dual union all

select 2, 2013, 17.5 from dual union all
select 2, 2016,   18 from dual union all
select 2, 2019,   18 from dual union all
select 2, 2022,   18 from dual union all

select 3, 2022,   20 from dual)


select * from road_inspections

   ROAD_ID       YEAR       COND
---------- ---------- ----------
         1       2009         17
         1       2011         16
         1       2015         14
         1       2016       18.3
         1       2019       18.1
         2       2013       17.5
         2       2016         18
         2       2019         18
         2       2022         18
         3       2022         20

db<>fiddle

In a query, for each road, I want to generate rows to fill in the gaps between the years.

  1. For a given road, starting at the first row (the earliest inspection), there should be consecutive rows for each year all the way to the current year (the sysdate year; currently 2022).
  2. For the filler rows, I want carry over the condition rating from the last known inspection.

The result would look like this:

   ROAD_ID       YEAR       COND
---------- ---------- ----------
         1       2009         17
         1       2010         17 *
         1       2011         16
         1       2012         16 *
         1       2013         16 *
         1       2014         16 *
         1       2015         14
         1       2016       18.3
         1       2017       18.3 *
         1       2018       18.3 *
         1       2019       18.1
         1       2020       18.1 *
         1       2021       18.1 *
         1       2022       18.1 *

         2       2013       17.5
         2       2014       17.5 *
         2       2015       17.5 *
         2       2016         18
         2       2017         18 *
         2       2018         18 *
         2       2019         18
         2       2020         18 *
         2       2021         18 *
         2       2022         18

         3       2022         20

*=filler row

Question:

How can I create those filler rows using an Oracle SQL query?

(My priorities are: simplicity first, performance second.)

CodePudding user response:

with
  road_inspections (road_id, year_, cond) as (
    select 1, 2009,   17 from dual union all
    select 1, 2011,   16 from dual union all
    select 1, 2015,   14 from dual union all
    select 1, 2016, 18.3 from dual union all
    select 1, 2019, 18.1 from dual union all
    select 2, 2013, 17.5 from dual union all
    select 2, 2016,   18 from dual union all
    select 2, 2019,   18 from dual union all
    select 2, 2022,   18 from dual union all
    select 3, 2022,   20 from dual
  )
, prep (road_id, first_year) as (
    select road_id, min(year_)
    from   road_inspections
    group  by road_id
  )
, all_years (road_id, year_) as (
    select p.road_id, l.year_
    from   prep p cross join lateral (
        select  first_year   level - 1 as year_
        from    dual
        connect by level <= 2022 - first_year   1
    ) l
  )
select road_id, year_,
       last_value(ri.cond ignore nulls) over 
           (partition by road_id order by year_) as cond
from   all_years ay left outer join road_inspections ri using (road_id, year_)
;

The first subquery, prep, finds the first year for each road id. This is used in the all_years subquery to generate all the years relevant for each road id.

Then left-outer-join to the original cata, copy the cond wherever it is available, and use the analytic function last_value with the ignore nulls option to fill in the gaps.

Note that I changed the column name year to year_ (with a trailing underscore); year is an Oracle keyword, not a good choice for a column name.

Output:

   ROAD_ID      YEAR_       COND
---------- ---------- ----------
         1       2009         17
         1       2010         17
         1       2011         16
         1       2012         16
         1       2013         16
         1       2014         16
         1       2015         14
         1       2016       18.3
         1       2017       18.3
         1       2018       18.3
         1       2019       18.1
         1       2020       18.1
         1       2021       18.1
         1       2022       18.1
         2       2013       17.5
         2       2014       17.5
         2       2015       17.5
         2       2016         18
         2       2017         18
         2       2018         18
         2       2019         18
         2       2020         18
         2       2021         18
         2       2022         18
         3       2022         20

CodePudding user response:

You can use the LEAD analytic function with a LATERAL joined hierarchical query to generate the missing rows from each row until the next row:

SELECT r.road_id,
       y.year,
       r.cond
FROM   ( SELECT r.*,
                LEAD(year, 1, EXTRACT(YEAR FROM SYSDATE)   1)
                  OVER (PARTITION BY road_id ORDER BY year) AS next_year
         FROM   road_inspections r
       ) r
       CROSS JOIN LATERAL (
         SELECT r.year   LEVEL - 1 AS year
         FROM   DUAL
         CONNECT BY r.year   LEVEL - 1 < r.next_year
       ) y

Which, for the sample data:

CREATE TABLE road_inspections (road_id, year, cond) as 
select 1, 2009,   17 from dual union all
select 1, 2011,   16 from dual union all
select 1, 2015,   14 from dual union all
select 1, 2016, 18.3 from dual union all
select 1, 2019, 18.1 from dual union all
select 2, 2013, 17.5 from dual union all
select 2, 2016,   18 from dual union all
select 2, 2019,   18 from dual union all
select 2, 2022,   18 from dual union all
select 3, 2022,   20 from dual;

Outputs:

ROAD_ID YEAR COND
1 2009 17
1 2010 17
1 2011 16
1 2012 16
1 2013 16
1 2014 16
1 2015 14
1 2016 18.3
1 2017 18.3
1 2018 18.3
1 2019 18.1
1 2020 18.1
1 2021 18.1
1 2022 18.1
2 2013 17.5
2 2014 17.5
2 2015 17.5
2 2016 18
2 2017 18
2 2018 18
2 2019 18
2 2020 18
2 2021 18
2 2022 18
3 2022 20

db<>fiddle here

  • Related