Is it possible to write a window function that partitions every time when partition value changes?
For example:
create table test (id int, i int);
insert into test values (1, 1), (2, 0),(3, 0),(4, 1),(5, 1),(6, 1),(7, 0),(8, 0),(9, 1);
select
id,
i,
row_number() over (partition by i order by id asc) rn
from
test
order by
id asc;
will give me the following result:
id i rn
----------
1 1 1
2 0 1
3 0 2
4 1 2
5 1 3
6 1 4
7 0 3
8 0 4
9 1 5
However, I want to be able to restart the counter again when the value of i
is changed based on my sort order, regardless of whether it already appeared or not.
For example, the desired result would be this:
id i rn
----------
1 1 1
2 0 1
3 0 2
4 1 1
5 1 2
6 1 3
7 0 1
8 0 2
9 1 1
The same principle would apply for other window functions like cumulative sum for example.
I really don't want to loop just for this and I'm unable to find the right window function.
Thanks
CodePudding user response:
This problem is a gaps and islands problem. I derived my solution to this problem from this one:
Reset Row Number on value change, but with repeat values in partition
My query looks as such:
with cte as (
SELECT id, i,
ROW_NUMBER() OVER (PARTITION BY i order by id) rn1
FROM test
)
SELECT id, i,
ROW_NUMBER() OVER (PARTITION BY CAST((id - rn1) as CHAR(100)) || CAST(i as CHAR(100)) ORDER BY id) rn
FROM cte
ORDER BY
id
I extended your data set to validate the answer. I used the same concepts in Tim Biegeleisen's answer. I get the row number of i, and subtract that from an id. The difference of these should result in the same value for each id number with the same i, until i switches again. I ran into a problem at 8,0 with row_number of 4 (resulting in difference of 4) and 9,1 and row_number of 5 (also resulting a difference of 4). To account for this situation, I turn the difference into a string and concatenate I, making a unique id that we can partition over.
You can check it out at this dbfiddle: https://www.db-fiddle.com/f/hnGUTZGLvRQ5JoSCHozAhs/0
CodePudding user response:
You can achieve this by adding a column that contains the previous i
(using lag(i)
), then keep a running total of the number of times there is a "flip" (current i <> previous i). Rows with the same number of flips belong to the same group.
with u as
(select id, i, lag(i) over (order by id) as previousi
from test),
v as
(select id,
i,
sum(case when i = previousi then 0 else 1 end)
over (order by id rows unbounded preceding) as flips
from u)
select id, i, row_number() over (partition by flips order by id) as rn
from v