Home > OS >  Pandas Group by and then find mean not working as expected
Pandas Group by and then find mean not working as expected

Time:07-15

Given the following table:

quarter     town    flat_type   median_rent
2011    CLEMENTI    4-RM    2250
2011    CLEMENTI    4-RM    2300
2011    CLEMENTI    4-RM    2300
2011    CLEMENTI    4-RM    2400

The median average rent for 4-RM in 2011 is (2250 2300 2300 2400)/4 = 2312.5.

My current code:

test = df.groupby(["town","quarter"], as_index=True)


print(test['median_rent'].mean())

output:

town      quarter
CLEMENTI  2011       5.625575e 14
          2012       5.875638e 14
          2013       6.500675e 14
          2014       6.875675e 14
          2015       6.625650e 14
          2016       6.500650e 14
          2017       6.250613e 14
          2018       5.750575e 14
          2019       7.667467e 10
          2020       4.800470e 18
          2021       6.000625e 14
Name: median_rent, dtype: float64

This is not what is expeced. What am I doing wrong?

CodePudding user response:

Judging by the sample code that you have written, you actually need to group by flat_type and quarter, not town and quarter. Just try

test = df.groupby(["flat_type", "quarter"], as_index=True)

and see whether it is what you were expecting.

CodePudding user response:

the groupby object should be used either to split the groupBy key and grouped df, or to apply actions directly to

example of applying mean directly to the groupby

import pandas as pd

df = pd.read_csv('tmp.csv')
print(df)

test = df.groupby(["town","quarter"], as_index=True)['median_rent'].mean()
print(test)

tmp.csv

quarter,town,flat_type,median_rent
2011,CLEMENTI,4-RM,2250
2011,CLEMENTI,4-RM,2300
2011,CLEMENTI,4-RM,2300
2011,CLEMENTI,4-RM,2400
2012,CLEMENT,4-RM,2250
2012,CLEMENT,4-RM,2300
2011,CLEMENT,4-RM,2300
2011,CLEMENT,4-RM,2400

output

   quarter      town flat_type  median_rent
0     2011  CLEMENTI      4-RM         2250
1     2011  CLEMENTI      4-RM         2300
2     2011  CLEMENTI      4-RM         2300
3     2011  CLEMENTI      4-RM         2400
4     2012   CLEMENT      4-RM         2250
5     2012   CLEMENT      4-RM         2300
6     2011   CLEMENT      4-RM         2300
7     2011   CLEMENT      4-RM         2400
town      quarter
CLEMENT   2011       2350.0
          2012       2275.0
CLEMENTI  2011       2312.5
Name: median_rent, dtype: float64
  • Related