Find the sum of a column by grouping two columns

Time:04-28

For this dataset, I want to find the sum of Value(£) for each combination of three columns: Year, Length Group and Port of Landing. For example, one sum would be for the year 2016, the Length Group 10m&Under and the Port of Landing Aberdaran.


CodePudding user response:

Given the response you gave to @berkayln, I think you want to project the group sums back onto your original dataframe as a new column. Does this suit your needs?

df['sumPerYearLengthGroupPortOfLanding'] = df.groupby(['Year', 'Length Group', 'Port of Landing'])['Value(£)'].transform('sum')
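To see how transform differs from a plain aggregation, here is a minimal sketch. The sample rows and the group_total column name are made up for illustration; only the column names come from the question.

```python
import pandas as pd

# Hypothetical sample rows: column names follow the question, values are invented.
df = pd.DataFrame({
    "Year": [2016, 2016, 2016, 2017],
    "Length Group": ["10m&Under", "10m&Under", "Over10m", "10m&Under"],
    "Port of Landing": ["Aberdaran", "Aberdaran", "Aberdaran", "Aberdaran"],
    "Value(£)": [100.0, 50.0, 30.0, 20.0],
})

keys = ["Year", "Length Group", "Port of Landing"]

# transform returns one value per original row, aligned on the original index,
# so every row carries the total of the group it belongs to.
df["group_total"] = df.groupby(keys)["Value(£)"].transform("sum")

print(df)
```

The dataframe keeps all four rows, and the two rows that share the same Year, Length Group and Port of Landing both get the same total.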

CodePudding user response:

You can try this one:

dataframe.groupby(['Year','Length Group','Port of Landing'])['Value(£)'].sum()

That should work.
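The result of that call is a Series with a three-level index. If a flat table is easier to work with, something like the sketch below might help. The sample rows and the names totals and flat are assumptions for illustration, not part of the original data.

```python
import pandas as pd

# Hypothetical sample rows: column names follow the question, values are invented.
dataframe = pd.DataFrame({
    "Year": [2016, 2016, 2016, 2017],
    "Length Group": ["10m&Under", "10m&Under", "Over10m", "10m&Under"],
    "Port of Landing": ["Aberdaran", "Aberdaran", "Aberdaran", "Aberdaran"],
    "Value(£)": [100.0, 50.0, 30.0, 20.0],
})

keys = ["Year", "Length Group", "Port of Landing"]

# One row per combination; the three keys become a MultiIndex.
totals = dataframe.groupby(keys)["Value(£)"].sum()

# Look up a single combination with a tuple of key values.
one_total = totals.loc[(2016, "10m&Under", "Aberdaran")]

# Turn the index levels back into ordinary columns.
flat = totals.reset_index()

print(one_total)
print(flat)
```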

CodePudding user response:

You can use pd.DataFrame.groupby to aggregate the data.

# Change the order if you want a different hierarchy 
grp_cols = ["Year", "Length Group", "Port of Landing"]
df.groupby(grp_cols)["Value(£)"].sum()

You can also total by each column separately (print the result, otherwise it is discarded inside the loop):

for col in grp_cols:
    print(df.groupby(col)["Value(£)"].sum())

You can also use .loc with a boolean mask to get the total for 2016 only.

df.loc[df["Year"] == 2016, "Value(£)"].sum()

The pd.DataFrame.groupby functionality lets you aggregate with functions other than .sum, including custom functions that operate on each sub-dataframe.
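As a rough sketch of that last point, .agg can mix built-in aggregations with a custom function. The sample rows, the value_range helper and the output column names (total, average, spread) are all hypothetical and only there to illustrate the idea.

```python
import pandas as pd

# Hypothetical sample rows: column names follow the question, values are invented.
df = pd.DataFrame({
    "Year": [2016, 2016, 2016, 2017],
    "Length Group": ["10m&Under", "10m&Under", "Over10m", "10m&Under"],
    "Port of Landing": ["Aberdaran", "Aberdaran", "Aberdaran", "Aberdaran"],
    "Value(£)": [100.0, 50.0, 30.0, 20.0],
})

grp_cols = ["Year", "Length Group", "Port of Landing"]


def value_range(values: pd.Series) -> float:
    """Hypothetical custom aggregation: spread between largest and smallest value."""
    return values.max() - values.min()


# Named aggregation: each keyword becomes a column in the result.
summary = df.groupby(grp_cols)["Value(£)"].agg(
    total="sum",
    average="mean",
    spread=value_range,
)

print(summary)
```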
