Home > Software engineering >  SQL calculate share of grouped variables to total count
SQL calculate share of grouped variables to total count

Time:10-10

Its probably very easy, bot somehow I cannot get the desired result:

My data looks like this: I have a large table with items sold. Each item has a category assigned (here A-D) and country. I would like to calculate how many items were sold in Europe, in each category, and what is the share of this particular category to total sales

my data looks like this:

country item_id item_cat
Europe 1 A
Europe 2 A
Europe 3 B
Europe 4 B
Europe 5 C
Europe 6 C
Europe 7 C
USA 8 D
USA 9 D
USA 10 D

my desired output like this:

country item_cat cat_sales total_sales share
Europe A 2 7 0.29
Europe B 2 7 0.29
Europe C 3 7 0.43

what I tried is:

SELECT 
   country, 
   item_cat, 
   count(*) as cat_sales,
   count(*) OVER () as total_sales,
   cat_sales / total_sales as share
FROM data
where country='Europe'
group by item_cat

but SQL tells me I cannot group and use windowing in one request. How could i solve this? Thanks in advance

CodePudding user response:

A few ways, one would be to pre-count the total sales in a CTE and then select from it for the remaining aggregate.

I don't use impala however in standard SQL this should work

with tot as (
    select *, 
      Count(*) over(partition by country) * 1.0 as total_sales
    from t
)
select country, item_cat, 
    Count(*) as cat_sales, 
    total_sales,
    Round(Count(*) / total_sales, 2) as Share
from tot
where country='europe'
group by country, item_cat, total_sales
  • Related