Home > Blockchain >  How to group by Max?
How to group by Max?

Time:04-02

I'm having some trouble with a statement, I have a table like this

Player Number Sport Date
1 soccer 4/1
4 basketball 4/2
2 basketball 4/3
3 soccer 4/1
1 baseball 4/2
4 basketball 4/3
5 soccer 4/1
3 baseball 4/2
5 soccer 4/2
2 basketball 4/3
1 soccer 4/4
2 baseball 4/5
3 soccer 4/4
4 soccer 4/5
5 soccer 4/3
5 basketball 4/5

I'm trying to write a query that will give me a table that will group together the players by their most played sport so the end result will look like this:

Sport # of Players Who played the sport the most
Baseball 2
Basketball 1
Soccer 2

I was able to write a statement that shows the number of times each player played a a specific sport but am having trouble getting the table to only select the rows with the highest number of plays and arranging it to be grouped by each sport.

Select distinct PLAYER_NUMBER, SPORT, Max(GAMES_PLAYED) as GAMES_PLAYED
FROM #temp
Group By SPORT, PLAYER_NUMER
order by GAMES_PLAYED desc 

CodePudding user response:

I don't think your desired results match your sample data?

Anyway start with a query to get the number of plays per player, and use the row_number window function to determine which sport was played the most for each player (note what do you want to do in the case of a tie?). Then in the next query only use the sport with the most plays per player and group by sport.

with cte as (
    select *
        -- For each player get the sport with the most plays
        , row_number() over (partition by Player order by count(*) desc) rn
    from #temp
    group by Player, Sport
)
select Sport
    , count(*) [# of Players Who played the sport the most]
from cte
where rn = 1
group by Sport
order by Sport;

Returns:

Sport # of Players Who played the sport the most
basketball 2
soccer 3

The following related query allows one to confirm the results shown above:

with cte as (
    select *
        , count(*) num
        , row_number() over (partition by Player order by count(*) desc) rn
    from #temp
    group by Player, Sport
)
select Player, Sport, num
from cte
order by Player, num desc;

Which shows for each player which sport was played the most i.e. its the first row for each player, and note baseball never shows up as the most played sport! The actual results are calculated using every rn=1 because that is the sport most played per player.

Player Sport num rn
1 soccer 2 1
1 baseball 1 2
2 basketball 2 1
2 baseball 1 2
3 soccer 2 1
3 baseball 1 2
4 basketball 2 1
4 soccer 1 2
5 soccer 3 1
5 basketball 1 2

CodePudding user response:

If I understand you right, you might also get this by subqueries

select s.sport,
       count(distinct s.playdate) as times_played,
       (select max(t.players) from (
          select count(distinct s2.player) as players
          from   sports s2 
          where  s2.sport = s.sport 
          group by s2.sport, s2.playdate
          ) t
       ) as nbr_players
from   sports s
group by s.sport

this gets me

sport times_played nbr_players
baseball 2 2
basketball 3 2
soccer 5 3

DBFiddle here

  • Related