I have data like this in SQL Server 2016 database table:
PERIODE | PERIODE_FORECAST | VALUE |
---|---|---|
2021-08-01 | 2021-01-01 | 51384.673 |
2021-08-01 | 2021-02-01 | 44118.129 |
2021-08-01 | 2021-03-01 | 43164.446 |
2021-08-01 | 2021-04-01 | 38113.745 |
2021-08-01 | 2021-05-01 | 37306.956 |
2021-08-01 | 2021-06-01 | 38390.359 |
2021-08-01 | 2021-07-01 | 42692.390 |
2021-08-01 | 2021-08-01 | 39814.047 |
2021-08-01 | 2021-09-01 | 0.000 |
2021-08-01 | 2021-10-01 | 0.000 |
2021-08-01 | 2021-11-01 | 0.000 |
2021-08-01 | 2021-12-01 | 0.000 |
I have a condition to fill out the zero values for example :
CASE
WHEN PERIODE_FORECAST > PERIODE
THEN [PREVIOUS_MONTH] * 0.1
ELSE VALUE
END
I am trying to use the LAG()
function, but it didn't come up as I was hoped for.
Query :
SELECT
PERIODE,PERIODE_FORECAST,
CASE
WHEN PERIODE_FORECAST > PERIODE
THEN LAG(VALUE, 1, 0) OVER (ORDER BY PERIODE_FORECAST ASC) * 0.1
ELSE VALUE
END VALUE
FROM
MyTable
Result :
PERIODE | PERIODE_FORECAST | VALUE |
---|---|---|
2021-08-01 | 2021-01-01 | 51384.673 |
2021-08-01 | 2021-02-01 | 44118.129 |
2021-08-01 | 2021-03-01 | 43164.446 |
2021-08-01 | 2021-04-01 | 38113.745 |
2021-08-01 | 2021-05-01 | 37306.956 |
2021-08-01 | 2021-06-01 | 38390.359 |
2021-08-01 | 2021-07-01 | 42692.390 |
2021-08-01 | 2021-08-01 | 39814.047 |
2021-08-01 | 2021-09-01 | 3981.4047 |
2021-08-01 | 2021-10-01 | 0.000 |
2021-08-01 | 2021-11-01 | 0.000 |
2021-08-01 | 2021-12-01 | 0.000 |
Expected results:
| PERIODE | PERIODE_FORECAST | VALUE |
| --------------|-----------------------|-----------|
| 2021-08-01 | 2021-01-01 | 51384.673 |
| 2021-08-01 | 2021-02-01 | 44118.129 |
| 2021-08-01 | 2021-03-01 | 43164.446 |
| 2021-08-01 | 2021-04-01 | 38113.745 |
| 2021-08-01 | 2021-05-01 | 37306.956 |
| 2021-08-01 | 2021-06-01 | 38390.359 |
| 2021-08-01 | 2021-07-01 | 42692.390 |
| 2021-08-01 | 2021-08-01 | 39814.047 |
| 2021-08-01 | 2021-09-01 | 3981.4047 |
| 2021-08-01 | 2021-10-01 | 398.14047 |
| 2021-08-01 | 2021-11-01 | 39.814047 |
| 2021-08-01 | 2021-12-01 | 3.9814047 |
|---------------|-----------------------|-----------|
Is there any solution or workaround without creating stored procedure?
Thanks
CodePudding user response:
Your problem is that you lag the value 0 when you are 2 month after the first periode. I think that you have to use a parameter for the second part of the function lag(). Someting like : lag(VALUE, DATEDIFF(m,convert(date,'2021-08-01'),convert(date,'2021-10-01')) ,0 )
here :
Lag( Value, DATEDIFF(m,convert(date,PERIODE),convert(date,PERIODE_FORECAST)) ,0 )
You maybe have to add this datediff as exponent for the multiplicator 0.1.
CodePudding user response:
This was an interesting question, and my approach is likely not the most performant since I am a programmer, but it does produce the results you expect. I first create a common table expression and number the rows with ROW_NUMBER. Then I join back to the CTE to find the single VALUE with the highest ROW_NUMBER. You can this use the difference in ROW_NUMBERs as a POWER of 10.0000 to get the multiplier.
First I created your data:
DECLARE @Forecast AS TABLE
(
[PERIODE] DATE NOT NULL,
[PERIODE_FORECAST] DATE NOT NULL,
[VALUE] DECIMAL(16, 8) NOT NULL
);
INSERT INTO @Forecast
(
[PERIODE],
[PERIODE_FORECAST],
[VALUE]
)
VALUES
('2021-08-01', '2021-01-01', 51384.673),
('2021-08-01', '2021-02-01', 44118.129),
('2021-08-01', '2021-03-01', 43164.446),
('2021-08-01', '2021-04-01', 38113.745),
('2021-08-01', '2021-05-01', 37306.956),
('2021-08-01', '2021-06-01', 38390.359),
('2021-08-01', '2021-07-01', 42692.390),
('2021-08-01', '2021-08-01', 39814.047),
('2021-08-01', '2021-09-01', 0.000),
('2021-08-01', '2021-10-01', 0.000),
('2021-08-01', '2021-11-01', 0.000),
('2021-08-01', '2021-12-01', 0.000);
Once I had this, the query is:
;WITH [NumberedRows]
AS (SELECT [PERIODE],
[PERIODE_FORECAST],
[VALUE],
ROW_NUMBER() OVER (PARTITION BY [PERIODE]
ORDER BY [PERIODE_FORECAST]
) AS [rn]
FROM @Forecast)
SELECT [nr1].[PERIODE],
[nr1].[PERIODE_FORECAST],
CASE WHEN [sub].[VALUE] IS NOT NULL THEN
[sub].[VALUE] * POWER(10.0000000, ([sub].[rn] - [nr1].[rn]))
ELSE
[nr1].[VALUE]
END AS [VALUE]
FROM [NumberedRows] AS [nr1]
LEFT OUTER JOIN
(
SELECT TOP(1)
[nr2].[PERIODE], [nr2].[PERIODE_FORECAST], [nr2].[VALUE], [nr2].[rn]
FROM [NumberedRows] [nr2]
WHERE [nr2].[VALUE] <> 0
ORDER BY [nr2].[rn] DESC
) AS [sub]
ON [sub].[PERIODE] = [nr1].[PERIODE]
AND [sub].[rn] < [nr1].[rn];
And the results:
PERIODE | PERIODE_FORECAST | VALUE |
---|---|---|
2021-08-01 | 2021-01-01 | 51384.673000 |
2021-08-01 | 2021-02-01 | 44118.129000 |
2021-08-01 | 2021-03-01 | 43164.446000 |
2021-08-01 | 2021-04-01 | 38113.745000 |
2021-08-01 | 2021-05-01 | 37306.956000 |
2021-08-01 | 2021-06-01 | 38390.359000 |
2021-08-01 | 2021-07-01 | 42692.390000 |
2021-08-01 | 2021-08-01 | 39814.047000 |
2021-08-01 | 2021-09-01 | 3981.404700 |
2021-08-01 | 2021-10-01 | 398.140470 |
2021-08-01 | 2021-11-01 | 39.814047 |
2021-08-01 | 2021-12-01 | 3.981405 |
CodePudding user response:
This is gaps and island problem where each non-zero value marks the beginning of a new island. Once grouped you can use first_value
to pick out the corresponding "previous value" while row_number
serves as an offset for computing a multiplier.
with A as (
select *,
sum(case when "value" <> 0 then 1 else 0 end)
over (order by periode_forecast) as grp
from T
), B as (
select *,
first_value("value")
over (partition by grp order by periode_forecast) as pv,
power(10e, 1 - row_number()
over (partition by grp order by periode_forecast)) as mult
from A
)
select periode_forecast, "value", grp, pv, pv * mult as new_value
from B
order by periode_forecast;
You may want to stick with decimal
math and avoid float
. If so then adjust the 10e
inside the power()
reference.
If there is no valid prior non-zero row then the result will be zero. It's not clear whether that will happen in your data or how to treat it differently.
https://dbfiddle.uk/?rdbms=sqlserver_2016&fiddle=545d6e84ea43885788e45a2fb0393884