Home > Blockchain >  Select all rows with top N most frequent columns for each Group
Select all rows with top N most frequent columns for each Group

Time:05-08

For each Group, I want to get the rows with most popular columns.

Here is an example, I have table with country, state and names.

For each country, I will select the rows with top 2 most popular names. For USA, The 2 most popular names are John and Joe; For Canada they are Will and Rafe.

enter image description here

Here is the input table in text format.

Country State Name
USA CA John
USA WA Jack
USA CA Joe
USA CT Joe
USA MN John
Canada AB Joe
Canada AB Rafe
Canada AB Will
Canada BC Will
Canada BC Rafe

Also, the output table.

Country State Name
USA CA John
USA CA Joe
USA CT Joe
USA MN John
Canada AB Rafe
Canada AB Will
Canada BC Will
Canada BC Rafe

CodePudding user response:

If your mysql version supports the window function, we can use COUNT aggregate to get counting each by grouping Name and Country.

Then use self-join with ROW_NUMBER to find most top 2 popular names in each country's name.

SELECT t2.*
FROM (
  SELECT *,ROW_NUMBER() OVER(PARTITION BY Country ORDER BY cnt desc) rn 
  FROM (
    SELECT Name,Country,COUNT(*)  cnt
    FROM T
    GROUP BY Name,Country
  ) t1
) t1
INNER JOIN T t2
ON t1.Country = t2.Country AND t1.Name = t2.Name
WHERE rn <= 2

sqlfiddle

  • Related