Its probably very easy, bot somehow I cannot get the desired result:
My data looks like this: I have a large table with items sold. Each item has a category assigned (here A-D) and country. I would like to calculate how many items were sold in Europe, in each category, and what is the share of this particular category to total sales
my data looks like this:
country | item_id | item_cat |
---|---|---|
Europe | 1 | A |
Europe | 2 | A |
Europe | 3 | B |
Europe | 4 | B |
Europe | 5 | C |
Europe | 6 | C |
Europe | 7 | C |
USA | 8 | D |
USA | 9 | D |
USA | 10 | D |
my desired output like this:
country | item_cat | cat_sales | total_sales | share |
---|---|---|---|---|
Europe | A | 2 | 7 | 0.29 |
Europe | B | 2 | 7 | 0.29 |
Europe | C | 3 | 7 | 0.43 |
what I tried is:
SELECT
country,
item_cat,
count(*) as cat_sales,
count(*) OVER () as total_sales,
cat_sales / total_sales as share
FROM data
where country='Europe'
group by item_cat
but SQL tells me I cannot group and use windowing in one request. How could i solve this? Thanks in advance
CodePudding user response:
A few ways, one would be to pre-count the total sales in a CTE and then select from it for the remaining aggregate.
I don't use impala however in standard SQL this should work
with tot as (
select *,
Count(*) over(partition by country) * 1.0 as total_sales
from t
)
select country, item_cat,
Count(*) as cat_sales,
total_sales,
Round(Count(*) / total_sales, 2) as Share
from tot
where country='europe'
group by country, item_cat, total_sales