I have a table with a user "id", the "country" they are from, and the "year" they signed up, for example:
id | country | year |
---|---|---|
1 | USA | 2010 |
2 | Mexico | 2010 |
3 | USA | 2011 |
4 | India | 2011 |
5 | Japan | 2011 |
I am trying to calculate the running count of distinct countries by year. So the final output for the example above should be:
year | country_count |
---|---|
2010 | 2 |
2011 | 4 |
I wrote something along these lines but obviously this is flawed logic since the 2nd half of the query isn't a distinct count:
with t1 as (
select year, count(distinct country) country_count
from data
group by 1 order by 1
)
select *, sum(country_count) over (order by year) AS cumulative_country_count
from t1
CodePudding user response:
select distinct year
,count(country) over(order by year) as cnt
from (
select *
,row_number() over(partition by country order by year) as rn
from t
) t
where rn = 1
year | cnt |
---|---|
2010 | 2 |
2011 | 4 |