Home > Net >  SQL Query - calculate pct change between latest two years
SQL Query - calculate pct change between latest two years

Time:10-20

I have a table in mysql database that has the following structure and data.

City    zip    year   value 

AB, NM  87102  2012   150
AB, NM  87102  2013   175
AB, NM  87102  2014   200
DL, TX  75212  2018   100
DL, TX  75212  2019   150
DL, TX  75212  2020   175 
AT, TX  83621  2020   150

I am trying to group by city, and zip fields and calculate % change between latest two available years for the group. Note, the latest two available years may not be consecutive and not all groups may have two years of data.

Expected output:

City     zip     pct_change

AB, NM   87102   14.3
DL, TX   75212   16.6

Query:

select City, zip, max(year), calculate diff between value
from table
group by City, zip
where ....

CodePudding user response:

lag(value) over(partition by City, zip order by year)

will return the value from the preceding row with the same City and zip values ordered by the year column.

Use a CTE and row_number() to limit the result to the latest row per group.

Now you just need some "simple" math:

with cte as (
  select 
    City,
    zip,
    value,
    lag(value) over(partition by City, zip order by year) as lvalue,
    row_number() over(partition by City, zip order by year desc) as rn  
  from tbl
)
select City, zip, (value/lvalue - 1) * 100 as avg_pct_change
from cte
where rn = 1 and lvalue is not null

CodePudding user response:

You may use ROW_NUMBER() window function to get the values for the last two years as the following:

SELECT City, Zip, 
      ( 
       MAX(CASE WHEN rn =1 THEN value END)/ 
       MAX(CASE WHEN rn =2 THEN value END) - 1
      ) * 100 pct_change
FROM
(
  SELECT *, ROW_NUMBER() OVER (PARTITION BY City, Zip ORDER BY year DESC) rn
  FROM Table_Name
) T
WHERE rn <= 2 
GROUP BY City, Zip
HAVING MAX(CASE WHEN rn =2 THEN value END) IS NOT null
ORDER BY Zip

MAX(CASE WHEN rn =1 THEN value END) will find the value for the last year.

MAX(CASE WHEN rn =2 THEN value END) will find the value for before the last year.

WHERE rn <= 2 only last two years are included for each group.

HAVING MAX(CASE WHEN rn =2 THEN value END) IS NOT null groups should have more than one row.

See a demo.

  • Related