Computing the average

Time:11-24

In the following dataset:

import pandas as pd

df = pd.DataFrame({'globalid': {0: '4388064', 1: '4388200', 2: '4399344', 3: '4400638', 4: '4401765', 5: '4401831', 6: '4402098', 7: '4406997', 8: '4407331', 9: '4417043', 10: '4437380', 11: '4442467', 12: '4401955', 13: '4425140', 14: '4426164', 15: '4405473', 16: '4411249', 17: '4388584', 18: '4400483', 19: '4433927', 20: '4413441', 21: '4436355', 22: '4443361', 23: '4443375', 24: '4388176'}, 'postcode': {0: '1774PG', 1: '7481LK', 2: '1068MS', 3: '5628EN', 4: '7731TV', 5: '5971CR', 6: '9571BM', 7: '1031KA', 8: '9076BK', 9: '4465AL', 10: '1096AC', 11: '3601', 12: '2563PT', 13: '2341HN', 14: '2553DM', 15: '2403EM', 16: '1051AN', 17: '4525AB', 18: '4542BA', 19: '1096AC', 20: '5508AE', 21: '1096AC', 22: '3543GC', 23: '4105TA', 24: '7742EH'}, 'koopprijs': {0: '139000', 1: '209000', 2: '267500', 3: '349000', 4: '495000', 5: '162500', 6: '217500', 7: '655000', 8: '180000', 9: '495000', 10: '2395000', 11: '355000', 12: '150000', 13: '167500', 14: '710000', 15: '275000', 16: '498000', 17: '324500', 18: '174500', 19: '610000', 20: '300000', 21: '2230000', 22: '749000', 23: '504475', 24: '239000'}, 'place_name': {0: 'Slootdorp', 1: 'Haaksbergen', 2: 'Amsterdam', 3: 'Eindhoven', 4: 'Ommen', 5: 'Grubbenvorst', 6: '2e Exloërmond', 7: 'Amsterdam', 8: 'St.-Annaparochie', 9: 'Goes', 10: 'Amsterdam', 11: 'Maarssen', 12: "'s-Gravenhage", 13: 'Oegstgeest', 14: "'s-Gravenhage", 15: 'Alphen aan den Rijn', 16: 'Amsterdam', 17: 'Retranchement', 18: 'Hoek', 19: 'Amsterdam', 20: 'Veldhoven', 21: 'Amsterdam', 22: 'Utrecht', 23: 'Culemborg', 24: 'Coevorden'}})

print(df)

I would like to compute the average asking price ('koopprijs') per place_name. Some places, such as Amsterdam, have multiple 'koopprijs' values, so I am looking for the average price for each place name. Can someone please provide the code, or explain how this can be computed?

CodePudding user response:

You can try the following:

df['koopprijs'] = df['koopprijs'].astype(int)  # the prices are stored as strings, so convert them to int first
df2 = df.groupby('place_name')['koopprijs'].mean()
print(df2)

This gives the following output (mean() returns floats, even for an integer column):

place_name
's-Gravenhage           430000.0
2e Exloërmond           217500.0
Alphen aan den Rijn     275000.0
Amsterdam              1109250.0
Coevorden               239000.0
Culemborg               504475.0
Eindhoven               349000.0
Goes                    495000.0
Grubbenvorst            162500.0
Haaksbergen             209000.0
Hoek                    174500.0
Maarssen                355000.0
Oegstgeest              167500.0
Ommen                   495000.0
Retranchement           324500.0
Slootdorp               139000.0
St.-Annaparochie        180000.0
Utrecht                 749000.0
Veldhoven               300000.0
Name: koopprijs, dtype: float64
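
If some prices might not be clean digit strings, a slightly more defensive variant could look like the sketch below. It uses a shortened stand-in for the question's data, and avg_price is just an illustrative column name, not something from the original post. pd.to_numeric with errors='coerce' turns unparseable values into NaN, which mean() then skips, and as_index=False returns a regular DataFrame instead of a Series.

```python
import pandas as pd

# Shortened stand-in for the question's data (prices stored as strings).
df = pd.DataFrame({
    'place_name': ['Amsterdam', 'Amsterdam', 'Goes', "'s-Gravenhage", "'s-Gravenhage"],
    'koopprijs': ['267500', '655000', '495000', '150000', '710000'],
})

# Convert to numbers; anything unparseable becomes NaN and is skipped by mean().
df['koopprijs'] = pd.to_numeric(df['koopprijs'], errors='coerce')

# as_index=False keeps place_name as a regular column, so the result is a DataFrame.
# 'avg_price' is a hypothetical column name chosen for this example.
avg = (
    df.groupby('place_name', as_index=False)['koopprijs']
      .mean()
      .rename(columns={'koopprijs': 'avg_price'})
)
print(avg)
```

Whether to coerce bad values to NaN or let astype(int) raise an error depends on how much you trust the data. Coercing hides bad rows, while raising forces you to look at them.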

CodePudding user response:

First change the data type of koopprijs to int, then use groupby with agg:

df['koopprijs'] = df['koopprijs'].astype('int')

# assign to a new name so the original DataFrame is not overwritten
avg_price = df.groupby(['place_name'])['koopprijs'].agg('mean')
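
Since agg is already in use, it can also return several statistics at once. The sketch below is one possible way to do that with named aggregation (available in pandas 0.25 and later). It reports the average price alongside the number of listings per place, which helps judge how much an average is worth. The sample rows are a subset of the question's data, and the names summary, avg_price and n_listings are made up for this example.

```python
import pandas as pd

# A few rows taken from the question's data (prices stored as strings).
df = pd.DataFrame({
    'place_name': ['Amsterdam', 'Amsterdam', 'Amsterdam', 'Utrecht', 'Hoek'],
    'koopprijs': ['498000', '610000', '2230000', '749000', '174500'],
})
df['koopprijs'] = df['koopprijs'].astype('int')

# Named aggregation: output_column=(input_column, function).
# 'avg_price' and 'n_listings' are hypothetical names chosen for this example.
summary = (
    df.groupby('place_name')
      .agg(avg_price=('koopprijs', 'mean'), n_listings=('koopprijs', 'count'))
      .sort_values('avg_price', ascending=False)  # most expensive places first
)
print(summary)
```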