I have a column with many zeros and I want to put only non-zero values into N groups of (approximately) equal sizes. All zero values should get group number 0. Is there an elegant way to do this?
My first approach was to use ntile_v2
from code below, but this way bin numbers dont't start with 1 and I don't get N bins.
with sample_data as
(select
rownum as id,
case when rownum <= 50 then 0 else rownum end as value
from xmltable('1 to 100')
)
select
id,
value,
ntile(50) over (order by value) as ntile_v1,
case when value = 0 then 0 else ntile(50) over (order by value) end as ntile_v2
from sample_data
order by id;
One solution would be to put non-zero values into separate CTE, compute the tiles there, and join back:
with sample_data as
(select
rownum as id,
case when rownum <= 50 then 0 else rownum end as value
from xmltable('1 to 100')
)
, sample_data_not_zero as
(select
id,
value,
ntile(50) over (order by value) as ntile_v3
from sample_data
where value <> 0
)
select
sd.id,
sd.value,
nvl(sdnz.ntile_v3, 0) as ntile_v3
from
sample_data sd
left outer join
sample_data_not_zero sdnz on sd.id = sdnz.id
order by id;
Unfortunately I have to compute this for many different columns and I'm wondering whether there is a shorter solution.
CodePudding user response:
You could partition the ntile
based on zero vs non-zero, in addition to your current case expression:
case
when value = 0 then 0
else ntile(50) over (partition by case when value = 0 then 0 else 1 end order by value)
end as ntile_v3
That seems to get the same result as your union
query.