Best way to count a distinct row grouped by a different row?

The database is BigQuery, but its SQL is very similar to most other databases, so I would expect the answer to apply generally.

The Table

user_id | date | category_id 
----------------------------
1       | xx   | 10
2       | xx   | 10
2       | xx   | 10
3       | xx   | 10
3       | xx   | 10
3       | xx   | 10
1       | xx   | 11
2       | xx   | 12

I would like to get a count of distinct user_id values per category_id.

So the query should produce:

category_id | distinct_user_count
---------------------------------
10          | 3
11          | 1
12          | 1

My apologies ahead of time if this has been asked before.

I should add that the obvious query:

SELECT 
  category_id,
  count(distinct user_id)
FROM t
group by category_id

produces unexpected results:

category_id | distinct_user_count
---------------------------------
10          | 1
11          | 1
12          | 1

CodePudding user response：

You can use this query:

SELECT 
    category_id,
    COUNT(DISTINCT user_id) AS distinct_user_count
FROM t
GROUP BY category_id
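
As a quick check of this query against the sample rows from the question, here is a minimal sketch that loads them into an in-memory SQLite database and runs the same statement. SQLite is only a stand-in for BigQuery here (an assumption made so the check is runnable anywhere), the date column holds the question's 'xx' placeholder, and the helper name distinct_users_per_category is hypothetical.

```python
import sqlite3

# Sample rows from the question: (user_id, date, category_id).
# The date values are the question's 'xx' placeholders.
ROWS = [
    (1, "xx", 10),
    (2, "xx", 10),
    (2, "xx", 10),
    (3, "xx", 10),
    (3, "xx", 10),
    (3, "xx", 10),
    (1, "xx", 11),
    (2, "xx", 12),
]

# Same query as in the answer, with ORDER BY added for a stable row order.
QUERY = """
SELECT
    category_id,
    COUNT(DISTINCT user_id) AS distinct_user_count
FROM t
GROUP BY category_id
ORDER BY category_id
"""


def distinct_users_per_category(rows):
    """Load rows into a throwaway SQLite table and run QUERY (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE t (user_id INTEGER, date TEXT, category_id INTEGER)"
        )
        conn.executemany("INSERT INTO t VALUES (?, ?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


result = distinct_users_per_category(ROWS)
print(result)  # [(10, 3), (11, 1), (12, 1)]
```

On this sample the query returns 3, 1, 1, which matches the expected output in the question. If the real table returns 1 for every category, the stored data or the column being counted probably differs from the sample shown.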

CodePudding user response：

Here's a simple solution.

select category_id
      ,count(distinct user_id) as distinct_user_count
from t
group by category_id

category_id | distinct_user_count
---------------------------------
10          | 3
11          | 1
12          | 1

CodePudding user response：

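Try this. Grouping by both columns leaves one row per (category_id, user_id) pair, and the window count then counts those rows within each category:

-- alias left unquoted: BigQuery reads double quotes as a string literal
select distinct
       count(user_id) over (partition by category_id) as users_per_id,
       category_id
from t
group by category_id, user_id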
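
The idea behind this answer, de-duplicating (category_id, user_id) pairs first and then counting rows per category, can also be written with a plain subquery instead of a window function. The sketch below is that alternative formulation, not the answer's exact query, run against the question's sample rows in an in-memory SQLite database. SQLite as a stand-in for BigQuery and the helper name count_via_subquery are assumptions.

```python
import sqlite3

# Sample rows from the question: (user_id, date, category_id).
ROWS = [
    (1, "xx", 10),
    (2, "xx", 10),
    (2, "xx", 10),
    (3, "xx", 10),
    (3, "xx", 10),
    (3, "xx", 10),
    (1, "xx", 11),
    (2, "xx", 12),
]

# Step 1 (inner query): keep one row per (category_id, user_id) pair.
# Step 2 (outer query): count the remaining rows in each category.
QUERY = """
SELECT
    category_id,
    COUNT(*) AS distinct_user_count
FROM (
    SELECT DISTINCT category_id, user_id
    FROM t
) AS pairs
GROUP BY category_id
ORDER BY category_id
"""


def count_via_subquery(rows):
    """Run the de-duplicate-then-count query on a throwaway table (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE t (user_id INTEGER, date TEXT, category_id INTEGER)"
        )
        conn.executemany("INSERT INTO t VALUES (?, ?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


result = count_via_subquery(ROWS)
print(result)  # [(10, 3), (11, 1), (12, 1)]
```

One difference worth noting: COUNT(*) over the de-duplicated pairs would count a NULL user_id as one user, whereas COUNT(DISTINCT user_id) ignores NULLs. The sample data has no NULLs, so both give the same result here.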