How to get the most repeated value of x column grouped by z column

I have a select that returns a table such as:

weekOfTheYear	mostRepeatedID
01	a
01	b
01	a
02	b
02	b
02	a

and what I need is:

weekOfTheYear	mostRepeatedID
01	a
02	b

so that each week of the year only appears once and the mostRepeatedID for each week, is the value that appears the most.

CodePudding user response：

You can use the DENSE_RANK analytic function to find the rows with the maximum count and then filter to only return the rows with the first rank:

SELECT weekOfTheYear,
       mostRepeatedId
FROM   table_name
GROUP BY 
       weekOfTheYear,
       mostRepeatedId
ORDER BY
       DENSE_RANK() OVER (
         PARTITION BY weekOfTheYear
         ORDER BY COUNT(*) DESC
       )
FETCH FIRST ROW WITH TIES;

Which, for the sample data:

CREATE TABLE table_name (weekOfTheYear, mostRepeatedID) AS
SELECT '01', 'a' FROM DUAL UNION ALL
SELECT '01', 'b' FROM DUAL UNION ALL
SELECT '01', 'a' FROM DUAL UNION ALL
SELECT '02', 'b' FROM DUAL UNION ALL
SELECT '02', 'b' FROM DUAL UNION ALL
SELECT '02', 'a' FROM DUAL UNION ALL
SELECT '03', 'a' FROM DUAL UNION ALL
SELECT '03', 'b' FROM DUAL UNION ALL
SELECT '03', 'c' FROM DUAL;

Outputs:

WEEKOFTHEYEAR	MOSTREPEATEDID
01	a
02	b
03	a
03	b
03	c

Note: If you only want a single row-per-group then use ROW_NUMBER rather than DENSE_RANK and, if you want the minimum count then ORDER BY COUNT(*) rather than ORDER BY COUNT(*) DESC.

fiddle

CodePudding user response：

What you want to have is descendingly sorted counts of each mostRepeatedID grouped by weekOfTheYear values such as

WITH t1 AS
(  
 SELECT weekOfTheYear, mostRepeatedID, COUNT(*) AS cnt
   FROM t -- your table
  GROUP BY weekOfTheYear, mostRepeatedID 
)
SELECT DISTINCT
       weekOfTheYear,
       MAX(mostRepeatedID) KEEP (DENSE_RANK FIRST ORDER BY cnt DESC) 
                           OVER (PARTITION BY weekOfTheYear) AS mostRepeatedID
  FROM t1

But this case, there's a drawback(if it's not problem for you) that the ties might occur for those counts, then the last alphabetically ordered value is returned(here b) as MAX implies.

Demo