Home > Software design >  Fill null value by previous value and group by Postgresql
Fill null value by previous value and group by Postgresql

Time:01-20

I have an table and I want to fill the null value with previous value order by date but there have an group too For example: Table X:

Date Group value
1/1/2023 A null
2/1/2023 A Kevin
3/1/2023 A null
4/1/2023 A Tom
5/1/2023 A null
1/1/2023 B Sara
2/1/2023 B null

So I want to group by Group column and fill the null value of value column, The group can be multi value and the date is unique per group. I want the result like this:

Date Group value
1/1/2023 A null
2/1/2023 A Kevin
3/1/2023 A Kevin
4/1/2023 A Tom
5/1/2023 A Tom
1/1/2023 B Sara
2/1/2023 B Sara

How can I do it in postgresql ? Please help me

I have tried and I realy don't know how to do it. I just a newbie too

CodePudding user response:

If the input data is always in this form, we can use GREATEST and LAG:

SELECT
xdate, 
xgroup,
GREATEST(xvalue, LAG(xvalue) OVER()) AS xvalue
FROM X
ORDER BY xgroup, xdate;

Try out here with your sample data: db<>fiddle

GREATEST fetches the highest of two (or more) values which is NOT NULL, LAG selects the value from the previous row.

If this is not sufficient in your scenario due to possible more complex input data, please edit your question to add further cases which should be covered.

In this answer, the columns were renamed by adding a x because the original names are SQL keywords and should be avoided if possible.

CodePudding user response:

You should use LAG window function.

CodePudding user response:

If you can have more than one NULL values consecutively, LAG function won't help you much. A generalized solution would use:

  • the COUNT window function to generate a partitioning of one non-null value and consecutive null values
  • the MAX window functions to reassign NULL values.
WITH cte AS (
    SELECT *,
           COUNT(CASE WHEN value_ IS NOT NULL THEN 1 END) OVER(
               PARTITION BY Group_ 
               ORDER     BY Date_
           ) AS rn
    FROM tab
)
SELECT Date_, Group_, MAX(value_) OVER(PARTITION BY group_, rn) AS value_
FROM cte
ORDER BY group_, Date_

Check the demo here.

  • Related