I have a table of road condition ratings (roads are rated from 1-20; 20 being good).
with road_inspections
(road_id, year, cond) as (
select 1, 2009, 17 from dual union all
select 1, 2011, 16 from dual union all
select 1, 2015, 14 from dual union all
select 1, 2016, 18.3 from dual union all
select 1, 2019, 18.1 from dual union all
select 2, 2013, 17.5 from dual union all
select 2, 2016, 18 from dual union all
select 2, 2019, 18 from dual union all
select 2, 2022, 18 from dual union all
select 3, 2022, 20 from dual)
select * from road_inspections
ROAD_ID YEAR COND
---------- ---------- ----------
1 2009 17
1 2011 16
1 2015 14
1 2016 18.3
1 2019 18.1
2 2013 17.5
2 2016 18
2 2019 18
2 2022 18
3 2022 20
In a query, for each road, I want to generate rows to fill in the gaps between the years.
- For a given road, starting at the first row (the earliest inspection), there should be consecutive rows for each year all the way to the current year (the sysdate year; currently 2022).
- For the filler rows, I want carry over the condition rating from the last known inspection.
The result would look like this:
ROAD_ID YEAR COND
---------- ---------- ----------
1 2009 17
1 2010 17 *
1 2011 16
1 2012 16 *
1 2013 16 *
1 2014 16 *
1 2015 14
1 2016 18.3
1 2017 18.3 *
1 2018 18.3 *
1 2019 18.1
1 2020 18.1 *
1 2021 18.1 *
1 2022 18.1 *
2 2013 17.5
2 2014 17.5 *
2 2015 17.5 *
2 2016 18
2 2017 18 *
2 2018 18 *
2 2019 18
2 2020 18 *
2 2021 18 *
2 2022 18
3 2022 20
*=filler row
Question:
How can I create those filler rows using an Oracle SQL query?
(My priorities are: simplicity first, performance second.)
CodePudding user response:
with
road_inspections (road_id, year_, cond) as (
select 1, 2009, 17 from dual union all
select 1, 2011, 16 from dual union all
select 1, 2015, 14 from dual union all
select 1, 2016, 18.3 from dual union all
select 1, 2019, 18.1 from dual union all
select 2, 2013, 17.5 from dual union all
select 2, 2016, 18 from dual union all
select 2, 2019, 18 from dual union all
select 2, 2022, 18 from dual union all
select 3, 2022, 20 from dual
)
, prep (road_id, first_year) as (
select road_id, min(year_)
from road_inspections
group by road_id
)
, all_years (road_id, year_) as (
select p.road_id, l.year_
from prep p cross join lateral (
select first_year level - 1 as year_
from dual
connect by level <= 2022 - first_year 1
) l
)
select road_id, year_,
last_value(ri.cond ignore nulls) over
(partition by road_id order by year_) as cond
from all_years ay left outer join road_inspections ri using (road_id, year_)
;
The first subquery, prep
, finds the first year for each road id. This is used in the all_years
subquery to generate all the years relevant for each road id.
Then left-outer-join to the original cata, copy the cond
wherever it is available, and use the analytic function last_value
with the ignore nulls
option to fill in the gaps.
Note that I changed the column name year
to year_
(with a trailing underscore); year
is an Oracle keyword, not a good choice for a column name.
Output:
ROAD_ID YEAR_ COND
---------- ---------- ----------
1 2009 17
1 2010 17
1 2011 16
1 2012 16
1 2013 16
1 2014 16
1 2015 14
1 2016 18.3
1 2017 18.3
1 2018 18.3
1 2019 18.1
1 2020 18.1
1 2021 18.1
1 2022 18.1
2 2013 17.5
2 2014 17.5
2 2015 17.5
2 2016 18
2 2017 18
2 2018 18
2 2019 18
2 2020 18
2 2021 18
2 2022 18
3 2022 20
CodePudding user response:
You can use the LEAD
analytic function with a LATERAL
joined hierarchical query to generate the missing rows from each row until the next row:
SELECT r.road_id,
y.year,
r.cond
FROM ( SELECT r.*,
LEAD(year, 1, EXTRACT(YEAR FROM SYSDATE) 1)
OVER (PARTITION BY road_id ORDER BY year) AS next_year
FROM road_inspections r
) r
CROSS JOIN LATERAL (
SELECT r.year LEVEL - 1 AS year
FROM DUAL
CONNECT BY r.year LEVEL - 1 < r.next_year
) y
Which, for the sample data:
CREATE TABLE road_inspections (road_id, year, cond) as
select 1, 2009, 17 from dual union all
select 1, 2011, 16 from dual union all
select 1, 2015, 14 from dual union all
select 1, 2016, 18.3 from dual union all
select 1, 2019, 18.1 from dual union all
select 2, 2013, 17.5 from dual union all
select 2, 2016, 18 from dual union all
select 2, 2019, 18 from dual union all
select 2, 2022, 18 from dual union all
select 3, 2022, 20 from dual;
Outputs:
ROAD_ID YEAR COND 1 2009 17 1 2010 17 1 2011 16 1 2012 16 1 2013 16 1 2014 16 1 2015 14 1 2016 18.3 1 2017 18.3 1 2018 18.3 1 2019 18.1 1 2020 18.1 1 2021 18.1 1 2022 18.1 2 2013 17.5 2 2014 17.5 2 2015 17.5 2 2016 18 2 2017 18 2 2018 18 2 2019 18 2 2020 18 2 2021 18 2 2022 18 3 2022 20
db<>fiddle here