Suppose I have the following table:
my_table
a | b |
---|---|
a1 | b1 |
a1 | b1 |
a1 | b2 |
a2 | b1 |
a2 | b2 |
a2 | b2 |
a2 | b2 |
I would like a query that will return me for each type of b find the a with the max count of b type and its count. For example in the above table I would like the result will be:
b | a_with_max_b | MAX(count_of_b) |
---|---|---|
b1 | a1 | 2 |
b2 | a2 | 3 |
So far I arrive to this query:
SELECT b , MAX(count_of_b)
FROM
(SELECT a, b, COUNT(b) count_of_b
FROM my_table
GROUP BY a, b)
GROUP BY b;
This will return me the correct result but without the a.
b | MAX(count_of_b) |
---|---|
b1 | 2 |
b2 | 3 |
But if I am adding the a :
SELECT b, a a_with_max_b , MAX(count_of_b)
FROM
(SELECT a, b, COUNT(b) count_of_b
FROM my_table
GROUP BY a, b)
GROUP BY b;
I will receive just a random a and not the correct one. For example:
b | a_with_max_b | MAX(count_of_b) |
---|---|---|
b1 | a1 | 2 |
b2 | a1 | 3 |
What am I missing?
Thanks in advance, Elad
CodePudding user response:
SELECT a, b, COUNT(*) counts
FROM mytable
GROUP BY a, b
gives raw counts data, and reuse it as subquery
SELECT *
FROM (
SELECT a, b, COUNT(*) counts
FROM mytable
GROUP BY a, b
) m
WHERE ( b, counts ) IN (
SELECT b, MAX(counts)
FROM (
SELECT a, b, COUNT(*) counts
FROM mytable
GROUP BY a, b
) n
GROUP BY b
)
CodePudding user response:
This involves two steps:
- counting the rows per a and b
- finding the maximum count per b
and then showing only the a and b pairs with a maximum count per b of course.
select b, a as a_with_max_b, max_count_of_b
from
(
select
a, b,
count(*) as cnt,
max(count(*)) over (partition by b) as max_count_of_b
from my_table
group by a, b
)
where cnt = max_count_of_b
order by b, a;