I have a select that returns a table such as:
weekOfTheYear | mostRepeatedID |
---|---|
01 | a |
01 | b |
01 | a |
02 | b |
02 | b |
02 | a |
and what I need is:
weekOfTheYear | mostRepeatedID |
---|---|
01 | a |
02 | b |
so that each week of the year only appears once and the mostRepeatedID for each week, is the value that appears the most.
CodePudding user response:
You can use the DENSE_RANK
analytic function to find the rows with the maximum count and then filter to only return the rows with the first rank:
SELECT weekOfTheYear,
mostRepeatedId
FROM table_name
GROUP BY
weekOfTheYear,
mostRepeatedId
ORDER BY
DENSE_RANK() OVER (
PARTITION BY weekOfTheYear
ORDER BY COUNT(*) DESC
)
FETCH FIRST ROW WITH TIES;
Which, for the sample data:
CREATE TABLE table_name (weekOfTheYear, mostRepeatedID) AS
SELECT '01', 'a' FROM DUAL UNION ALL
SELECT '01', 'b' FROM DUAL UNION ALL
SELECT '01', 'a' FROM DUAL UNION ALL
SELECT '02', 'b' FROM DUAL UNION ALL
SELECT '02', 'b' FROM DUAL UNION ALL
SELECT '02', 'a' FROM DUAL UNION ALL
SELECT '03', 'a' FROM DUAL UNION ALL
SELECT '03', 'b' FROM DUAL UNION ALL
SELECT '03', 'c' FROM DUAL;
Outputs:
WEEKOFTHEYEAR | MOSTREPEATEDID |
---|---|
01 | a |
02 | b |
03 | a |
03 | b |
03 | c |
Note: If you only want a single row-per-group then use ROW_NUMBER
rather than DENSE_RANK
and, if you want the minimum count then ORDER BY COUNT(*)
rather than ORDER BY COUNT(*) DESC
.
CodePudding user response:
What you want to have is descendingly sorted counts of each mostRepeatedID
grouped by weekOfTheYear
values such as
WITH t1 AS
(
SELECT weekOfTheYear, mostRepeatedID, COUNT(*) AS cnt
FROM t -- your table
GROUP BY weekOfTheYear, mostRepeatedID
)
SELECT DISTINCT
weekOfTheYear,
MAX(mostRepeatedID) KEEP (DENSE_RANK FIRST ORDER BY cnt DESC)
OVER (PARTITION BY weekOfTheYear) AS mostRepeatedID
FROM t1
But this case, there's a drawback(if it's not problem for you) that the ties might occur for those counts, then the last alphabetically ordered value is returned(here b
) as MAX implies.