And what i want to do is to calculate the difference in terms of days between the dates. The first step that i'm trying to implement is to pull the previous rows and then calculate the difference.
Here's what i tried :
SELECT YEAR,MONTH,DAY,
lag(DATE) OVER w AS Lag
FROM DATASET
WINDOW w AS (PARTITION BY YEAR , MONTH , DAY ORDER BY YEAR , MONTH)
when i try to implement the lag i get this error:
check the manual that corresponds to your MySQL server version for the right syntax to use near 'Lag
Any help would be greatly appreciated , thank you
CodePudding user response:
To use LAG()
you need to specify at least an ordering criteria for the rows, as in lag(date) over(order by date)
.
For example:
select date - lag(date) over(order by date) as diff from dataset
This will compute a previous row by sorting the set by date
.
You can also add subgroups by including PARTITION BY
in the OVER
clause.
CodePudding user response:
You can create a CTE
(or a subquery) to put your LAG
function in, then perform a DATEDIFF
function to get the day difference between two dates.
CTE Version:
WITH CTE AS (SELECT YEAR, MONTH, DAY, DATE,
LAG(DATE) OVER (ORDER BY DATE) AS Lag_Date
FROM DATASET)
SELECT *,
COALESCE(DATEDIFF(DATE, Lag_Date), 0) AS Day_Diff
FROM CTE
Subquery Version:
SELECT *,
COALESCE(DATEDIFF(DATE, Lag_Date), 0) AS Day_Diff
FROM (SELECT YEAR, MONTH, DAY, DATE,
LAG(DATE) OVER (ORDER BY DATE) AS Lag_Date
FROM DATASET) a
Result:
YEAR | MONTH | DAY | DATE | Lag_Date | Day_Diff |
---|---|---|---|---|---|
2021 | 01 | 30 | 2021-01-30 | null | 0 |
2021 | 02 | 27 | 2021-02-27 | 2021-01-30 | 28 |
2021 | 12 | 31 | 2021-12-31 | 2021-02-27 | 307 |
2022 | 01 | 29 | 2022-01-29 | 2021-12-31 | 29 |
Fiddle here.