Home > Software engineering >  PostgreSQL recursively lookup value
PostgreSQL recursively lookup value

Time:10-18

I have the following changes table to store the changes of each ID, I would like to recursively lookup the latest changed ID of each ID.

Changes table:

old_id new_id
1 2
2 3
3 4
999 1000

Expected result:

original_id latest_id
1 4
2 4
3 4
999 1000

SQL:

with recursive r as (
   select * from changes
   union
   select t1.old_id, t2.new_id
   from r t1 left join changes t2
   on t1.new_id=t2.old_id
), changes as (
   select * from (values(1,2)) as t(old_id,new_id)
   union
   select * from (values(2,3)) as t(old_id,new_id)
   union
   select * from (values(3,4)) as t(old_id,new_id)
   union
   select * from (values(999,1000)) as t(old_id,new_id)
)
select * from r

CodePudding user response:

Maybe I'm overthinking it, I'm not sure, but the solution below works for your test data and is somewhat understandable (to me at least).

The idea behind it is to add a root ID to all ID pairs using a recursive query. Then we use window functions partitioned by roots to find out the latest ID.

with recursive
  --
  -- Test data
  --
  changes(old_id, new_id) as (values
    (1,2),
    (2,3),
    (3,4),
    (999,1000)
  ),

  --
  -- Obtain all root IDs, i.e., all `old_id` values that don't appear in the
  -- `new_id` column.
  --
  roots(root_id) as (
    select old_id from changes where old_id not in (select new_id from changes)
  ),

  --
  -- Add a root ID to all pairs.
  --
  rooted_changes(root_id, old_id, new_id) as (
    select old_id, old_id, new_id
    from changes
    join roots on changes.old_id = roots.root_id

    union all

    select older.root_id, newer.old_id, newer.new_id
    from rooted_changes older
    join changes newer on older.new_id = newer.old_id
  ),

  --
  -- Partition each row over the root ID that it belongs to and extract
  -- the max `new_id` value (alternatives in terms of `first_value` and
  -- `last_value` are given).
  --
  result as (
    select
      old_id,
      max(new_id) over roots,

      -- alternatively
      first_value(new_id) over (roots order by new_id desc),

      -- alternatively
      last_value(new_id) over (roots
        order by new_id asc
        rows between current row and unbounded following
      )
    from rooted_changes
    window roots as (partition by root_id)
  )

-- Uncomment selectively to debug various stages:

-- select * from roots

-- select * from rooted_changes order by root_id asc

select * from result order by old_id asc;

Final result:

┌────────┬──────┬─────────────┬────────────┐
│ old_id │ max  │ first_value │ last_value │
├────────┼──────┼─────────────┼────────────┤
│      1 │    4 │           4 │          4 │
│      2 │    4 │           4 │          4 │
│      3 │    4 │           4 │          4 │
│    999 │ 1000 │        1000 │       1000 │
└────────┴──────┴─────────────┴────────────┘

Intermediate result of select * from roots:

┌─────────┐
│ root_id │
├─────────┤
│       1 │
│     999 │
└─────────┘

Intermediate result of select * from rooted_changes order by root_id asc:

┌─────────┬────────┬────────┐
│ root_id │ old_id │ new_id │
├─────────┼────────┼────────┤
│       1 │      1 │      2 │
│       1 │      2 │      3 │
│       1 │      3 │      4 │
│     999 │    999 │   1000 │
└─────────┴────────┴────────┘
  • Related