How to sort from specific values of string?

Time:03-30

I'm using pandas and my DataFrame looks like this:

Unnamed: 0    id             player_name  games  time  goals         xG  \
0             0   647              Harry Kane     35  3097     23  22.174859   
1             1  1250           Mohamed Salah     37  3085     22  20.250847   
2             2  1228         Bruno Fernandes     37  3117     18  16.019454   
3             3   453           Son Heung-Min     37  3139     17  11.023287   
4             4   822         Patrick Bamford     38  3085     17  18.401863   
..          ...   ...                     ...    ...   ...    ...        ...  

  assists         xA  shots  key_passes  yellow_cards  red_cards position  \
0         14   7.577094    138          49             1          0        F   
1          5   6.528526    126          55             0          0    F M S   
2         12  11.474996    121          95             6          0      M S   
3         10   9.512992     68          75             0          0    F M S   
4          7   3.782247    107          30             3          0      F S   
..       ...        ...    ...         ...           ...        ...      ...  

       team_title  npg       npxG    xGChain  xGBuildup  
0            Tottenham   19  19.130183  24.995648   4.451257  
1            Liverpool   16  15.683834  28.968234   9.800236  
2    Manchester United    9   8.407840  26.911412  11.932285  
3            Tottenham   16  10.262118  20.671916   6.608751  
4                Leeds   15  16.879525  23.394953   4.131796  
..                 ...  ...        ...        ...        ...  

I'm trying to group it by team_title and sort it by goals and assists, so that the result looks something like this:

team_title  player_name  goals  assists
Tottenham    Harry Kane  23      14
Tottenham    Gareth Bale ...     ...
Tottenham    Son Heung-Min ...   ...

I've tried using

if any('Tottenham' in db['team_title']):
    db.groupby('team_title')['player_name','goals','assists'].value_counts()

and the error message I'm getting is

TypeError: 'bool' object is not iterable

Is it correct to use if-else here, or is there another way to sort by a specific string value?

CodePudding user response:

If I have understood you correctly, you want to sort the DataFrame with sort_values() by team name (which keeps each team's rows together), then by goals and assists (in that order), highest first.

You could do this with this line of code:

db.sort_values(by=["team_title", "goals", "assists"], ascending=[True, False, False], inplace=True)

An example of its use:

import pandas as pd
db = pd.DataFrame(data={"id": [0, 1, 2, 3, 4, 5], "player_name": ["Kane", "Salag", "Fernandes", "Heung-Min", "Bamford", "PlayerX"],
                        "games": [35, 37, 37, 37, 38, 35], "time": [3097, 2085, 3117, 3139, 3085, 4000],
                        "goals": [23, 22, 18, 17, 17, 12], "assists": [14, 5, 12 ,10, 7, 9],
                        "team_title": ["Tottenham", "Liverpool", "Manchester United", "Tottenham", "Leeds", "Tottenham"]})

db.sort_values(by=["team_title", "goals", "assists"], ascending=[True, False, False], inplace=True)

By passing a list to the ascending parameter, you set the sort direction for each column in by, so in this example:

  1. Sort by team_title in ascending order (which will group all teams together).
  2. Sort by number of goals scored in descending order (highest first).
  3. Lastly, sort by number of assists in descending order.

Output with example data:

#Out: 
#   id player_name  games  time  goals  assists         team_title
#4   4     Bamford     38  3085     17        7              Leeds
#1   1       Salag     37  2085     22        5          Liverpool
#2   2   Fernandes     37  3117     18       12  Manchester United
#0   0        Kane     35  3097     23       14          Tottenham
#3   3   Heung-Min     37  3139     17       10          Tottenham
#5   5     PlayerX     35  4000     12        9          Tottenham
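
To get closer to the layout asked for in the question (team_title, player_name, goals, assists), the sorted frame can be narrowed to those four columns. The following is a minimal sketch, assuming a trimmed-down version of the example data above; the name summary is only illustrative, and it returns a new DataFrame instead of using inplace=True.

```python
import pandas as pd

# Trimmed-down example data (an assumption; only the columns needed here).
db = pd.DataFrame({
    "player_name": ["Kane", "Salah", "Fernandes", "Heung-Min", "Bamford", "PlayerX"],
    "goals": [23, 22, 18, 17, 17, 12],
    "assists": [14, 5, 12, 10, 7, 9],
    "team_title": ["Tottenham", "Liverpool", "Manchester United",
                   "Tottenham", "Leeds", "Tottenham"],
})

# Sort by team (A-Z), then goals and assists (highest first),
# keep only the columns of interest, and renumber the rows.
summary = (
    db.sort_values(by=["team_title", "goals", "assists"],
                   ascending=[True, False, False])
      [["team_title", "player_name", "goals", "assists"]]
      .reset_index(drop=True)
)

print(summary)
```

Because sort_values() returns a new DataFrame here, db itself keeps its original row order.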

If you then wanted to retrieve specifically "Tottenham" data:

print(db[db["team_title"] == "Tottenham"])
#Out: 
#   id player_name  games  time  goals  assists team_title
#0   0        Kane     35  3097     23       14  Tottenham
#3   3   Heung-Min     37  3139     17       10  Tottenham
#5   5     PlayerX     35  4000     12        9  Tottenham
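
If only one team is of interest, the filter can also be applied first, so that only that team's rows are sorted. A minimal sketch, assuming a small made-up frame whose rows are deliberately out of order; the variable name tottenham is illustrative.

```python
import pandas as pd

# Small illustrative frame (an assumption), deliberately unsorted.
db = pd.DataFrame({
    "player_name": ["PlayerX", "Salah", "Kane", "Heung-Min"],
    "goals": [12, 22, 23, 17],
    "assists": [9, 5, 14, 10],
    "team_title": ["Tottenham", "Liverpool", "Tottenham", "Tottenham"],
})

# Boolean mask keeps only Tottenham rows; then sort by goals and
# assists, both descending.
tottenham = (
    db[db["team_title"] == "Tottenham"]
    .sort_values(by=["goals", "assists"], ascending=False)
)

print(tottenham)
```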

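Your if statement doesn't work because 'Tottenham' in db['team_title'] evaluates to a single bool. (On a Series, in checks the index labels rather than the values, so here it is False.) You are therefore effectively writing if any(False):, and any() expects an iterable, not a bool, which is why you get the TypeError. See this GeeksforGeeks page.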
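
If a check for whether a team appears in the column is still wanted, one way to write it is with an element-wise comparison followed by the Series method .any(). This is a minimal sketch on a small made-up frame; the names in_index, in_values, has_team and spurs are illustrative.

```python
import pandas as pd

# Small illustrative frame (an assumption) with the default 0..n-1 index.
db = pd.DataFrame({
    "player_name": ["Kane", "Salah", "Heung-Min"],
    "team_title": ["Tottenham", "Liverpool", "Tottenham"],
})

# `in` on a Series tests the index labels (0, 1, 2 here), not the values.
in_index = "Tottenham" in db["team_title"]

# Testing against the underlying array of values does look at the values.
in_values = "Tottenham" in db["team_title"].values

# Element-wise comparison gives a boolean Series, which has an .any() method.
has_team = (db["team_title"] == "Tottenham").any()

if has_team:
    spurs = db[db["team_title"] == "Tottenham"]
    print(spurs)
```

For sorting or filtering alone the if is not needed: a boolean mask that matches nothing simply returns an empty DataFrame.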
