Home > Software engineering >  Oracle SQL: How to fill Null value with data from most recent previous date that is not null?
Oracle SQL: How to fill Null value with data from most recent previous date that is not null?

Time:11-22

Essentially date field is updated every month along with other fields, however one field is only updated ~6 times throughout the year. For months where that field is not updated, looking to show the most recent previous data

Date Emp_no Sales Group
Jan 1234 100 Med
Feb 1234 200 ---
Mar 1234 170 ---
Apr 1234 150 Low
May 1234 180 ---
Jun 1234 90 High
Jul 1234 100 ---

Need it to show:

Date Emp_no Sales Group
Jan 1234 100 Med
Feb 1234 200 Med
Mar 1234 170 Med
Apr 1234 150 Low
May 1234 180 Low
Jun 1234 90 High
Jul 1234 100 High

This field is not updated at set intervals, could be 1-4 months of Nulls in a row

Tried something like this to get the second most recent date but unsure how to deal with the fact that i could need between 1-4 months prior

 LAG(Group)
    OVER(PARTITION BY emp_no
    ORDER BY date)

Thanks!

CodePudding user response:

This is the traditional "gaps and islands" problem.

There are various ways to solve it, a simple version will work for you.

First, create a new identifier that splits the rows in to "groups", where only the first row in each group is NOT NULL.

SUM(CASE WHEN "group" IS NOT NULL THEN 1 ELSE 0 END) OVER (PARTION BY emp_no ORDER BY "date") AS emp_group_id

Then you can use MAX() in another window function, as all "groups" will only have one NOT NULL value.

WITH
  gaps
AS
(
  SELECT
    t.*,
    SUM(
      CASE WHEN "group" IS NOT NULL
           THEN 1
           ELSE 0
      END
    )
      OVER (
        PARTITION BY emp_no
            ORDER BY "date"
      )
        AS emp_group_id
  FROM
    your_table t
)
SELECT
  "date",
  emp_no,
  sales,
  MAX("group")
    OVER (
      PARTITION BY emp_no, emp_group_id
    )
      AS "group"
FROM
  gaps

Edit

Ignore all that.

Oracle has IGNORE NULLS.

LAST_VALUE("group" IGNORE NULLS)
  OVER (
    PARTITION BY emp_no
        ORDER BY "date"
    ROWS BETWEEN UNBOUNDED PRECEDING
             AND CURRENT ROW
  )
    AS "group"
  • Related