Is there a way to calculate query where a row in a new column takes on the next value which is higher than itself in a different column. I have tried using a rolling max method, but I have hit a wall because it involves specifying a specific number of rows to compare like so:
select t.*,
max(val) over (order by date rows between 2 preceding and current row) max3
from t;
Where we need to specify that the max will be calculated between the current row and 2 rows prior.
However, what I would like to do is something like this:
---------------------------------------
val | id | newcol |
2 | ABC | 4 |
2 | ABC | 4 |
2 | ABC | 4 |
4 | ABC | 10 |
4 | ABC | 10 |
10 | ABC | 9999 |
10 | ABC | 9999 |
3 | EFG | 5 |
5 | EFG | 9 |
9 | EFG | 9999 |
2 | HIJ | 9999 |
7 | KLM | 11 |
11 | KLM | 12 |
12 | KLM | 9999 |
Where I partition by the id and the new col is assigned the next value in the val col which is higher than its own value (regardless of how many rows ahead it is) and assigns an arbitrary large number (9999) if it is already the highest value in its partition.
So far I have come up with this:
SELECT val,
id,
CASE WHEN val = MAX(val) OVER(PARTITION BY id) THEN 9999
ELSE MAX(val) OVER(partition by id ORDER BY val ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)
END AS newcol
FROM table
ORDER BY ID;
However this just assigns the maximum value of the partition to any value which is not the maximum like this:
---------------------------------------
val | id | newcol |
2 | ABC | 10 |
2 | ABC | 10 |
2 | ABC | 10 |
4 | ABC | 10 |
4 | ABC | 10 |
10 | ABC | 9999 |
10 | ABC | 9999 |
3 | EFG | 9 |
5 | EFG | 9 |
9 | EFG | 9999 |
2 | HIJ | 9999 |
7 | KLM | 11 |
11 | KLM | 11 |
12 | KLM | 9999 |
Any help would be much appreciated.
CodePudding user response:
You need a range clause in order to look at distinct values. Order by value and get the minimum value of all following (higher) values.
coalesce(min(region_kz) over (partition by id
order by val
range between 1 following and unbounded following),
9999) as newcol
CodePudding user response:
In SQL Server, I could:
with T as (
select *
from (values
('2022-01-01',2,'ABC',4)
,('2022-01-02',2,'ABC',4)
,('2022-01-03',2,'ABC',4)
,('2022-01-04',4,'ABC',10)
,('2022-01-05',4,'ABC',10)
,('2022-01-06',10,'ABC',9999)
,('2022-01-07',10,'ABC',9999)
,('2022-01-08',3,'EFG',5)
,('2022-01-09',5,'EFG',9)
,('2022-01-10',9,'EFG',9999)
,('2022-01-11',2,'HIJ',9999)
,('2022-01-12',7,'KLM',11)
,('2022-01-13',11,'KLM',12)
,('2022-01-14',12,'KLM',9999)
) T (Date, val, id, Expected)
),
TDR as (
select
t.*
, DR=DENSE_RANK() over (partition by id order by val asc)
from t
)
select
TDRT.*,
coalesce(TDRN.val,9999) as NextHigherVal
from
TDR as TDRT
left join
(select id, DR, val
from TDR
group by id, dr, val
) as TDRN
on TDRT.id=TDRN.id
and TDRT.DR 1=TDRN.DR
By no means the only solution.