I have this dataframe, and I'd like to build a new one that gives, for every country, the mean of 'Count' over the set of years.
country Alpha-3 code Year Count
0 Australia AUS 2005 2.000000
1 Austria AUT 2005 1.000000
2 Belgium BEL 2005 0.000000
3 Canada CAN 2005 4.000000
4 China CHN 2005 0.000000
5 Australia AUS 2006 4.000000
6 Austria AUT 2006 1.000000
7 Belgium BEL 2006 1.000000
8 Canada CAN 2006 6.000000
9 China CHN 2006 2.000000
10 Australia AUS 2007 5.000000
11 Austria AUT 2007 0.000000
12 Belgium BEL 2007 2.000000
13 Canada CAN 2007 5.000000
14 China CHN 2007 3.000000
15 Australia AUS 2008 7.000000
16 Austria AUT 2008 0.000000
17 Belgium BEL 2008 1.000000
18 Canada CAN 2008 5.000000
19 China CHN 2008 3.000000
I'd like to get something like this:
country Count
Australia 4.5
Austria 0.5
etc.
Thanks in advance
CodePudding user response:
To calculate the mean of a whole column, select that column and call pandas.Series.mean().
To get the mean of several columns at once, select a list of columns and call DataFrame.mean(); the default axis=0
argument gives the column-wise mean. Note that this returns one mean per column over all rows, not one mean per country.
For more, see https://sparkbyexamples.com/pandas/pandas-get-column-average-mean/ or https://www.statology.org/pandas-average-selected-columns/
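As a rough sketch of the two calls mentioned in this answer, the snippet below uses only the Australia and Austria rows for 2005 and 2006 from the question; the variable names `overall_mean` and `column_means` are just illustrative.

```python
import pandas as pd

# Four rows copied from the question's data (Australia and Austria, 2005-2006).
df = pd.DataFrame({
    "country": ["Australia", "Austria", "Australia", "Austria"],
    "Year": [2005, 2005, 2006, 2006],
    "Count": [2.0, 1.0, 4.0, 1.0],
})

# Series.mean(): a single scalar for the whole 'Count' column.
overall_mean = df["Count"].mean()

# DataFrame.mean() on a list of columns: one value per selected column.
column_means = df[["Year", "Count"]].mean(axis=0)

print(overall_mean)
print(column_means)
```

Both results aggregate over every row, so they do not separate the countries from each other.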
CodePudding user response:
You can use pandas.DataFrame.groupby for this.
out1 = df.groupby("country", as_index=False)["Count"].mean()  # returns a DataFrame
out2 = df.groupby("country")["Count"].mean()  # returns a Series indexed by country
Output:
print(out1)
country Count
0 Australia 4.5
1 Austria 0.5
2 Belgium 1.0
3 Canada 5.0
4 China 2.0
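For anyone who wants to reproduce this end to end, here is a self-contained sketch that rebuilds the question's dataframe and applies the same groupby. The helper names (`counts_by_year`, `rows`, `subset`, `out_subset`) are made up for this example, and the year filter at the end is an assumption about what "the set of the years" might mean if only part of the range is wanted.

```python
import pandas as pd

countries = ["Australia", "Austria", "Belgium", "Canada", "China"]
codes = ["AUS", "AUT", "BEL", "CAN", "CHN"]

# 'Count' values per year, in the same country order as above (from the question).
counts_by_year = {
    2005: [2.0, 1.0, 0.0, 4.0, 0.0],
    2006: [4.0, 1.0, 1.0, 6.0, 2.0],
    2007: [5.0, 0.0, 2.0, 5.0, 3.0],
    2008: [7.0, 0.0, 1.0, 5.0, 3.0],
}

rows = [
    (country, code, year, count)
    for year, counts in counts_by_year.items()
    for country, code, count in zip(countries, codes, counts)
]
df = pd.DataFrame(rows, columns=["country", "Alpha-3 code", "Year", "Count"])

# Mean of 'Count' per country over all years; as_index=False keeps
# 'country' as a regular column instead of the index.
out1 = df.groupby("country", as_index=False)["Count"].mean()
print(out1)

# Optional: restrict to a subset of years first (between() is inclusive).
subset = df[df["Year"].between(2005, 2006)]
out_subset = subset.groupby("country", as_index=False)["Count"].mean()
print(out_subset)
```

groupby sorts the group keys by default, which is why the countries come out in alphabetical order.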