I want to find the categories that make up a certain percentage of sales, and segment them according to their percentages in SQL. To do this, I must first sort them by their revenues in descending order and then select the top N percent. For example, if the total revenue is 20M:
Category Revenue
1 6.000.000
2 4.000.000
3 4.000.000
4 3.000.000
5 1.500.000
6 500.000
7 400.000
8 300.000
9 200.000
10 100.000
Total 20.000.000
-Categories that make up 70% (14M) of revenue - segment A
-Categories that make up 15% (3M) - segment B
-Categories that make up 10% (2M) - segment C
-Categories that make up 5% (1M) - segment D
So, the segments should be like this:
Category Segment
1 A
2 A
3 A
4 B
5 C
6 C
7 D
8 D
9 D
10 D
CodePudding user response:
I'm sure there's a simpler way of getting this result, but here's one [rather long] query that classifies the categories in segments according to your logic:
with
q as (
select *,
sum(revenue) over(order by revenue desc) as acc_revenue,
sum(revenue) over() as tot_revenue
from t
),
a as (
select * from q where acc_revenue <= 0.7 * tot_revenue
),
b as (
select *
from q
where acc_revenue - (select max(acc_revenue) from a) <= 0.15 * tot_revenue
and category not in (select category from a)
),
c as (
select *
from q
where acc_revenue - (select max(acc_revenue) from b) <= 0.10 * tot_revenue
and category not in (select category from a)
and category not in (select category from b)
),
d as (
select *
from q
where category not in (select category from a)
and category not in (select category from b)
and category not in (select category from c)
)
select *, 'A' as segment from a
union all select *, 'B' from b
union all select *, 'C' from c
union all select *, 'D' from d
Result:
category revenue acc_revenue tot_revenue segment
--------- -------- ------------ ------------ -------
1 6000000 6000000 20000000 A
2 4000000 14000000 20000000 A
3 4000000 14000000 20000000 A
4 3000000 17000000 20000000 B
5 1500000 18500000 20000000 C
6 500000 19000000 20000000 C
7 400000 19400000 20000000 D
8 300000 19700000 20000000 D
9 200000 19900000 20000000 D
10 100000 20000000 20000000 D
See running example at DB Fiddle.