This is how my data looks like:
Item | Year | Price |
---|---|---|
Hat | 2020 | 15 |
Hat | 2021 | 17 |
Hat | 2022 | 19 |
Pen | 2020 | 3 |
Pen | 2021 | 2 |
Pen | 2022 | 3.3 |
I want to use PostgreSQL to achieve a simple data analysis task that will provide the result below:
Item | Year | Price | Previous Year | Previous price | higher_than_previous_year |
---|---|---|---|---|---|
Hat | 2020 | 15 | |||
Hat | 2021 | 17 | 2020 | 15 | yes |
Hat | 2022 | 19 | 2021 | 17 | yes |
Pen | 2020 | 3 | |||
Pen | 2021 | 2 | 2020 | 3 | no |
Pen | 2022 | 3.3 | 2021 | 2 | yes |
Previous Year: Fetches the closest year that is lesser than the current year specified in the year column. Previous Price: Fetches the price for the item that corresponds with the previous year. higher_than_previous_year: compares the price column with the previous price column and computes yes when the price is higher than the previous price and no when the previous price is higher than the year price.
I will appreciate it if any PostgreSQL guru can take a crack at this. Thanks!
CodePudding user response:
You could use simple JOINS
and UNION
to achieve this (one of the many ways):
Query:
SELECT
a.item,
a.year,
a.price,
NULL AS prev_year,
NULL AS prev_price,
NULL AS higher_than_previous_year
FROM
(
Select *, ROW_NUMBER() OVER (PARTITION BY Item ORDER BY Year) as rn
from t
) as a
WHERE rn=1
UNION
SELECT
a.item,
a.year,
a.price,
b.year as prev_year,
b.price as prev_price,
CASE
WHEN a.price > b.price THEN 'yes'
ELSE 'No'
END AS higher_than_previous_year
FROM
(
Select *, ROW_NUMBER() OVER (PARTITION BY Item ORDER BY Year) as rn
from t
) as a
LEFT JOIN
(
Select *, ROW_NUMBER() OVER (PARTITION BY Item ORDER BY Year) as rn
from t
) as b
ON a.item=b.item and a.rn=b.rn 1
WHERE a.rn>1
Explanation:
- First creating a
ROW_NUM
column to find prev rows - Using the same table in all the places whereverFROM
andJOIN
is being used - Above
UNION
: Getting rows with 1 as they are quite easy to go ahead - Below
UNION
: Self joining the table to get it's prev values
See fiddle in db<>fiddle
If you have access, I would suggest to create intermediate tables
so that the query looks simpler. See this fiddle (created intermediate tables so query is easy to understand and debug)
CodePudding user response:
Just use the lag window function, which allows you to look back the previous row(s). Since this requires several look back operation, you define and use a WINDOW definition in the query. (see demo)
select item, year, price
, lag(year) over w previous_year
, lag(price) over w previous_price
, case when price > lag(price) over w then 'Yes'
when lag(price) over w is null then null::text
else 'No'
end higher_than_previous_year
from origin
window w as (partition by item order by year);