Home > other >  Sum multiple columns to a two column output/dataframe
Sum multiple columns to a two column output/dataframe

Time:05-24

I am exploring a rugby dataset where I am stuck on a bit of an issue. The dataset contains matches played in the the Six Nations over a year, and the tries are put into two separate columns, which is "Home try's" and "Away try's". I am trying to print out the output of the teams with the largest amount of try`s. Below is an example of how the dataset looks.

Home Away Home try's Away try's
England France 3 2
Ireland Scotland 1 4
France Ireland 1 0
Scotland England 1 3

I have tried to put the home and away columns to a single column as a list and use the arguments df.explode and df.groupby argument to create a dataframe, but it doesnt work. Any tips on how I could solve this?

The ideal output would be:

England Scotland France Ireland
6 5 3 1

CodePudding user response:

How about this?

series_home = df.groupby('Home').sum()["Home try's"]
series_away = df.groupby('Away').sum()["Away try's"]
combined_series = series_home   series_away
combined_df = pd.DataFrame(combined_series).transpose()

CodePudding user response:

I was able to do this in a way combining the two DFs and using a pivot.

df_Home = df[['Home', 'Home_Try']]
df_Away = df[['Away', 'Away_Try']]
df_Away.columns = [['Team', 'Tries']]
df_Home.columns = [['Team', 'Tries']]
df2 = pd.concat([df_Home, df_Away])
df2.columns = ['Team', 'Tries']
df_pivot = pd.pivot_table(df2, values='Tries', columns=['Team'], aggfunc=np.sum)
df_pivot
  • Related