Home > Back-end >  LAG function alternative. I need the results for the missing year in between
LAG function alternative. I need the results for the missing year in between

Time:12-10

I have this table so far. However, I would like to obtain the results for 2019 which there are no records so it becomes 0. Are there any alternatives to the LAG funciton.

ID Year Year_Count
1 2018 10
1 2020 20

Whenever I use the LAG function in SQL it gives me the results for 2018. However, I would like to get 0 for 2019 and then 10 for 2018

LAG(YEAR_COUNT) OVER (PARTITION BY ID ORDER BY YEAR) AS previous_year_count 

CodePudding user response:

untested notepad scribble

CASE
WHEN 1 = YEAR - LAG(YEAR) OVER (PARTITION BY ID ORDER BY YEAR) 
THEN LAG(YEAR_COUNT) OVER (PARTITION BY ID ORDER BY YEAR)
ELSE 0
END AS previous_year_count

CodePudding user response:

I'll add on to Nick's comment here with an example.

The YEARS CTE here is creating that table of years as he suggested, the RECORDS table is matching the above posted. Then they get joined together with COALESCE to fill in the null values left by the LEFT JOIN (filled ID with 0, not sure what your case would be).

You would need to LEFT JOIN onto the YEAR table and select the YEAR variable from the YEAR table in the final query, otherwise you'd only end up with only 2018/2020 or those years and some null values

WITH
  YEARS AS
(
  SELECT 2016 AS YEAR UNION ALL
  SELECT 2017 UNION ALL
  SELECT 2018 UNION ALL
  SELECT 2019 UNION ALL
  SELECT 2020 UNION ALL
  SELECT 2021  UNION ALL
  SELECT 2022   
)
,
  RECORDS AS
(
  SELECT 1 ID,  2018 YEAR,  10 YEAR_COUNT UNION ALL
  SELECT 1, 2020,   20)

SELECT 
 COALESCE(ID, 0) AS ID, 
 Y.YEAR, 
 COALESCE(YEAR_COUNT, 0) AS YEAR_COUNT
FROM YEARS AS Y
LEFT JOIN RECORDS AS R
ON R.YEAR = Y.YEAR

Here is the dbfiddle so you can visualize - https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=9e777ad925b09eb8ba299d610a78b999

Vertica SQL is not an available test environment, so this may not work directly but should at least get you on the right track.

The LAG function would not work to get 2019 for a few reasons

  • It's a window function and can only grab from data that is available - the default for LAG in your case appears to be 1 aka LAG(YEAR_COUNT, 1)
  • Statements in the select typically can't add any rows data back into a table, you would need to add in data with JOINs

If 2019 does exist in a prior table and you're using group by to get year count, it's possible that you have a where clause excluding the data.

  • Related