Group by multiple column data frame in pandas and get mean value of a column

Time:09-04

I have a dataframe like this.

input:

        Country  Year  AvgTemperature
1826    Algeria  2000  43.9
1827    Algeria  2000  46.5
.
.
7826    Algeria  2016  72.2
7827    Algeria  2016  69.4
.
.
858661  Poland   2000  63.6
858662  Poland   2000  61.9
.
.
857763  Poland   2015  34.8
857764  Poland   2015  39.2
...

I want to group the rows by Year and Country and take the mean of the AvgTemperature column, so that the output looks like this:

        Country  Year  AvgTemperature
1826    Algeria  2000  45.5
.
.
7826    Algeria  2016  70.9
.
.
858661  Poland   2000  62.8
.
.
857763  Poland   2015  37
...

So far I have tried this:

aggregation_functions = {'AvgTemperature': 'mean'}
df_new = df.groupby(df['Year', 'Country']).aggregate(aggregation_functions)

But I am getting this error: KeyError: ('Year', 'Country')

CodePudding user response:

df_new = df.groupby(['Year', 'Country']).aggregate(aggregation_functions)
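Some context on why this works: df['Year', 'Country'] asks pandas for a single column labelled by the tuple ('Year', 'Country'), which does not exist, hence the KeyError. Passing a list of column names to groupby groups by both columns instead. Below is a minimal sketch; the four-row sample frame is an assumption built from rows shown in the question, not the full dataset.

```python
import pandas as pd

# Small sample built from rows in the question (the real frame is much larger).
df = pd.DataFrame({
    'Country': ['Algeria', 'Algeria', 'Poland', 'Poland'],
    'Year': [2000, 2000, 2015, 2015],
    'AvgTemperature': [43.9, 46.5, 34.8, 39.2],
})

# Indexing with two labels looks up ONE column keyed by the tuple
# ('Year', 'Country'); no such column exists, so pandas raises KeyError.
try:
    df['Year', 'Country']
    raised = False
except KeyError:
    raised = True

# Passing a list of column names groups by both columns.
aggregation_functions = {'AvgTemperature': 'mean'}
df_new = df.groupby(['Year', 'Country']).aggregate(aggregation_functions)
print(df_new)
```

Note that the result is indexed by a (Year, Country) MultiIndex; calling .reset_index() on it turns the group keys back into ordinary columns.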

CodePudding user response:

# Import Module
import pandas as pd

# Data Import and Pre-Process
df = pd.DataFrame({'Country':['Algeria','Algeria','Algeria','Algeria','Poland','Poland','Poland','Poland'],
'Year':['2000','2000','2016','2016','2000','2000','2015','2015'],
'AvgTemperature':[43.9,46.5,72.2,69.4,63.6,61.9,34.8,39.2]})
df_v2 = df.groupby(['Country','Year'])['AvgTemperature'].mean().reset_index()

# Output Display (print so the result also shows when run as a script)
print(df_v2)

Hi Ferdous, please try the code above. If you have any questions, please let me know.

Thanks, Leon
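As a possible variation on the approach above, groupby also accepts as_index=False, which keeps the group keys as regular columns so no reset_index() call is needed. This is only a sketch using the same sample rows; the name df_v3 is made up for illustration.

```python
import pandas as pd

# Same sample data as in the answer above (Year stored as strings there).
df = pd.DataFrame({
    'Country': ['Algeria', 'Algeria', 'Algeria', 'Algeria',
                'Poland', 'Poland', 'Poland', 'Poland'],
    'Year': ['2000', '2000', '2016', '2016', '2000', '2000', '2015', '2015'],
    'AvgTemperature': [43.9, 46.5, 72.2, 69.4, 63.6, 61.9, 34.8, 39.2],
})

# as_index=False keeps Country and Year as ordinary columns in the result.
df_v3 = df.groupby(['Country', 'Year'], as_index=False)['AvgTemperature'].mean()
print(df_v3)
```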
