I have a database that looks like this:
ID | Sale_Date(YYYY-MM-DD) | Total_Volume |
---|---|---|
123 | 2022-01-01 | 0 |
123 | 2022-01-02 | 2 |
123 | 2022-01-03 | 5 |
456 | 2022-04-06 | 38 |
456 | 2022-04-07 | 40 |
456 | 2022-04-08 | 45 |
I want to get a daily sale column from Total Volume. which is just by subtracting the total volume on date x with total volume on date x-1 for each id.
ID | Sale_Date(YYYY-MM-DD) | Total_Volume | Daily_Sale |
---|---|---|---|
123 | 2022-01-01 | 0 | 0 |
123 | 2022-01-02 | 2 | 2 |
123 | 2022-01-03 | 5 | 3 |
456 | 2022-04-06 | 38 | 38 |
456 | 2022-04-07 | 40 | 2 |
456 | 2022-04-08 | 45 | 5 |
My initial attempt was using a rank function and self join but that didnt turn out correct.
with x as (
select
distinct t1.ID,
t1.Sale_Date,
t1.Total_volume,
rank() over (partition by ID order by Sale_Date) as ranker
from t t1 order by t1.Sale_Date)
select t2.ID, t2.ranker, t2.Sale_date, t1.Total_volume, t1.Total_volume - t2.Total_volume as Daily_sale
from x t1, x t2 where t1.ID = t2.ID and t2.ranker = t1.ranker-1 order by t1.ID;
CodePudding user response:
You should use:
- the
LAG
window function to retrieve last "Sale_Date" value - the
COALESCE
function to replace NULL with "Total Volume" for each first rows
Then subtract Total_Volume
from the previous value of Total_Volume
and coalesce if the value of the LAG
is NULL.
SELECT *,
COALESCE(`Total_Volume`
-LAG(`Total_Volume`) OVER(PARTITION BY `ID`
ORDER BY `Sale_Date(YYYY-MM-DD)`), `Total_Volume`) AS `Daily_Sale`
FROM tab
Check the demo here.