Home > Back-end >  How to use aggregate and apply lambda at the same time in pandas group by?
How to use aggregate and apply lambda at the same time in pandas group by?

Time:05-29

In this dataframe:

dataframe

I want to find number of home wins and number of home matches for each team in each season. This is my code:

viewmatches.groupby(["season","home_team"]).apply(lambda x:x[x["home_team_goal"]>x["away_team_goal"]]["home_team"].count()).reset_index(name="home_wins")
viewmatches.groupby(["season","home_team"]).apply(lambda x:x["home_team"].count()).reset_index(name="home_matches")

result for the first line is this:

result#1

and the result for the second line is this:

result#2

So my question is how can I use groupby agg function to get this result:

result

CodePudding user response:

I'm not sure how this could be done using an aggregation function, but it seems like what you want to achieve is to just join the tables. This can be done like so:

A = pd.DataFrame({'K':list(range(4)), 'M':list(range(4)), 'A':['a','b','c', 'd']})
B = pd.DataFrame({'K':list(range(4)), 'M':list(range(4)), 'B':['r','x','y', 'z']})

print(A.join(B.set_index(['K', 'M']), on=['K', 'M']))
   K  M  A  B
0  0  0  a  r
1  1  1  b  x
2  2  2  c  y
3  3  3  d  z

CodePudding user response:

Here's a way to use agg to get both desired columns at once:

x = viewmatches.assign(
    home_wins=viewmatches["home_team_goal"]>viewmatches["away_team_goal"], 
    home_matches=1)[["season","home_team","home_wins","home_matches"]].groupby(
    ["season","home_team"]).agg(sum).reset_index()

Note that this is equivalent to using sum() in place of agg(sum):

x = viewmatches.assign(
    home_wins=viewmatches["home_team_goal"]>viewmatches["away_team_goal"], 
    home_matches=1)[["season","home_team","home_wins","home_matches"]].groupby(
    ["season","home_team"]).sum().reset_index()

Explanation:

  • create the home_wins and home_matches columns in the original dataframe
  • call groupby and sum
  • use reset_index to reassign season and home_team from index levels to columns

Full test code:

import pandas as pd
viewmatches = pd.DataFrame({
'season':[1,1,1,1,1]*3   [2,2,2,2,2]*3,
'home_team':['1. FC Koln', 'ADO Den Haag', 'AJ Auxerre', 'AS Monaco', 'AS Nancy-Lorraine']*3   ['1. FC Koln', 'ADO Den Haag', 'AJ Auxerre', 'AS Monaco', 'AS Nancy-Lorraine']*3,
'home_team_goal':[0,0,0,0,0] [1,1,1,1,1] [2,2,2,2,2] [3,3,3,3,3] [4,4,4,4,4] [5,5,5,5,5],
'away_team_goal':[0,1,2,3,4]*6,
'other_column_not_needed_in_result_1':[None]*30,
'other_column_not_needed_in_result_2':[None]*30,
'other_column_not_needed_in_result_3':[None]*30
})
print(viewmatches)
x = viewmatches.assign(
    home_wins=viewmatches["home_team_goal"]>viewmatches["away_team_goal"], 
    home_matches=1)[["season","home_team","home_wins","home_matches"]].groupby(
    ["season","home_team"]).agg(sum).reset_index()
print(x)

Input:

    season          home_team  home_team_goal  away_team_goal other_col_1 other_col_2 other_col_3
0        1         1. FC Koln               0               0        None        None        None
1        1       ADO Den Haag               0               1        None        None        None
2        1         AJ Auxerre               0               2        None        None        None
3        1          AS Monaco               0               3        None        None        None
4        1  AS Nancy-Lorraine               0               4        None        None        None
5        1         1. FC Koln               1               0        None        None        None
6        1       ADO Den Haag               1               1        None        None        None
7        1         AJ Auxerre               1               2        None        None        None
8        1          AS Monaco               1               3        None        None        None
9        1  AS Nancy-Lorraine               1               4        None        None        None
10       1         1. FC Koln               2               0        None        None        None
11       1       ADO Den Haag               2               1        None        None        None
12       1         AJ Auxerre               2               2        None        None        None
13       1          AS Monaco               2               3        None        None        None
14       1  AS Nancy-Lorraine               2               4        None        None        None
15       2         1. FC Koln               3               0        None        None        None
16       2       ADO Den Haag               3               1        None        None        None
17       2         AJ Auxerre               3               2        None        None        None
18       2          AS Monaco               3               3        None        None        None
19       2  AS Nancy-Lorraine               3               4        None        None        None
20       2         1. FC Koln               4               0        None        None        None
21       2       ADO Den Haag               4               1        None        None        None
22       2         AJ Auxerre               4               2        None        None        None
23       2          AS Monaco               4               3        None        None        None
24       2  AS Nancy-Lorraine               4               4        None        None        None
25       2         1. FC Koln               5               0        None        None        None
26       2       ADO Den Haag               5               1        None        None        None
27       2         AJ Auxerre               5               2        None        None        None
28       2          AS Monaco               5               3        None        None        None
29       2  AS Nancy-Lorraine               5               4        None        None        None

Output:

   season          home_team  home_wins  home_matches
0       1         1. FC Koln          2             3
1       1       ADO Den Haag          1             3
2       1         AJ Auxerre          0             3
3       1          AS Monaco          0             3
4       1  AS Nancy-Lorraine          0             3
5       2         1. FC Koln          3             3
6       2       ADO Den Haag          3             3
7       2         AJ Auxerre          3             3
8       2          AS Monaco          2             3
9       2  AS Nancy-Lorraine          1             3
  • Related