Postgresql - Update Query - How To Set Column Value To Be The Name Of The Last Column That Is Not nu-CodePudding

I have some data similar to that listed below:

      | id | status   | stage1 | stage2 | stage3 | stage4 | stage5 | rejected_at_stage
----------------------------------------------------------------------------------------------
row 1 | 1  | rejected | 125    |   null |  null  |  null  |  null  |    stage1
row 2 | 2  | rejected | 125    |   123  |  null  |  null  |  null  |    stage2
row 3 | 3  | rejected | 125    |   123  |  154   |  null  |  null  |    stage3
row 4 | 4  | rejected | 125    |   123  |  154   |  132   |  null  |    stage4
row 5 | 5  | rejected | 125    |   123  |  154   |  132   |  143   |    stage5

The column rejected_at_stage is a new column that is to be backfilled to set the name of the stages column (stage1...stage5) that is the last instance (left to right) that is not null. For example, in row 2 the rejected_at_stage value is stage2. This is because stage1 is not null and stage2 is not null. However, stage3 has a null value. So the rejected_at_stage column = last non null value from stages columns (left to right) => stage 2.

Is it possible to do the following in postgresql to update the rejected_at_stage column for each row in a select statement?

find the last column that is not null from the given set of columns (e.g.[stage1, stage2...., stage5]
Update the rejected_at_stage column with the name of the column that was found

CodePudding user response：

It’s possible, but this doesn't look like great design. What happens when a new stage is added? I suspect your issues will disappear if you make a separate "Stage" table with two columns: Stage_Number and Stage_Value. Then finding the last filled in stage is a simple MAX query (and the rejection comes from adding one more to that value).

When you hard-wire the stages, you are left with a rather clumsy construct like

CASE WHEN stage1 IS NULL THEN 'stage1'
ELSE WHEN stage2 IS NULL THEN 'stage2'
etc.
END CASE

which, as I say, is possible, but inelegant and inflexible.

CodePudding user response：

You can convert the columns to a JSON structure, then unnest the keys, sort them and pick the first one:

select t.*, 
       (select x.col
        from jsonb_each_text(jsonb_strip_nulls(to_jsonb(t) - 'id' - 'status')) as x(col, val)
        order by x.col desc
        limit 1) as final_stage
from the_table t

Note this won't work correctly if you have more than 9 columns as sorting by name will put stage10 before stage2 so with the above descending sort stage2 will be returned. If you need that, you need to convert the name to a proper number that is sorted correctly:

order by regexp_replace(x.col, '[^0-9]', '', 'g')::int desc