In the following dataset:
import pandas as pd
df = pd.DataFrame({'globalid': {0: '4388064', 1: '4388200', 2: '4399344', 3: '4400638', 4: '4401765', 5: '4401831', 6: '4402098', 7: '4406997', 8: '4407331', 9: '4417043', 10: '4437380', 11: '4442467', 12: '4401955', 13: '4425140', 14: '4426164', 15: '4405473', 16: '4411249', 17: '4388584', 18: '4400483', 19: '4433927', 20: '4413441', 21: '4436355', 22: '4443361', 23: '4443375', 24: '4388176'}, 'postcode': {0: '1774PG', 1: '7481LK', 2: '1068MS', 3: '5628EN', 4: '7731TV', 5: '5971CR', 6: '9571BM', 7: '1031KA', 8: '9076BK', 9: '4465AL', 10: '1096AC', 11: '3601', 12: '2563PT', 13: '2341HN', 14: '2553DM', 15: '2403EM', 16: '1051AN', 17: '4525AB', 18: '4542BA', 19: '1096AC', 20: '5508AE', 21: '1096AC', 22: '3543GC', 23: '4105TA', 24: '7742EH'}, 'koopprijs': {0: '139000', 1: '209000', 2: '267500', 3: '349000', 4: '495000', 5: '162500', 6: '217500', 7: '655000', 8: '180000', 9: '495000', 10: '2395000', 11: '355000', 12: '150000', 13: '167500', 14: '710000', 15: '275000', 16: '498000', 17: '324500', 18: '174500', 19: '610000', 20: '300000', 21: '2230000', 22: '749000', 23: '504475', 24: '239000'}, 'place_name': {0: 'Slootdorp', 1: 'Haaksbergen', 2: 'Amsterdam', 3: 'Eindhoven', 4: 'Ommen', 5: 'Grubbenvorst', 6: '2e Exloërmond', 7: 'Amsterdam', 8: 'St.-Annaparochie', 9: 'Goes', 10: 'Amsterdam', 11: 'Maarssen', 12: "'s-Gravenhage", 13: 'Oegstgeest', 14: "'s-Gravenhage", 15: 'Alphen aan den Rijn', 16: 'Amsterdam', 17: 'Retranchement', 18: 'Hoek', 19: 'Amsterdam', 20: 'Veldhoven', 21: 'Amsterdam', 22: 'Utrecht', 23: 'Culemborg', 24: 'Coevorden'}})
print(df)
I would like to compute the average asking price (the 'koopprijs' column) per place_name. Can someone please provide the code, or explain how this can be computed? Some places, such as Amsterdam, have several 'koopprijs' values, so I need one average price per place_name.
CodePudding user response:
You can try the following:
df['koopprijs'] = df['koopprijs'].astype(int) # the prices are stored as strings, so convert them to int first
df2 = df.groupby('place_name')['koopprijs'].mean()
print(df2)
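You will get output like the following (shown here as whole numbers; mean() returns float64, so the values print with a trailing .0 unless you cast the result back to int):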
place_name
's-Gravenhage 430000
2e Exloërmond 217500
Alphen aan den Rijn 275000
Amsterdam 1109250
Coevorden 239000
Culemborg 504475
Eindhoven 349000
Goes 495000
Grubbenvorst 162500
Haaksbergen 209000
Hoek 174500
Maarssen 355000
Oegstgeest 167500
Ommen 495000
Retranchement 324500
Slootdorp 139000
St.-Annaparochie 180000
Utrecht 749000
Veldhoven 300000
Name: koopprijs, dtype: int32
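As a quick sanity check, the Amsterdam figure can be reproduced on its own. The sketch below uses only the six Amsterdam prices from the question's dataset; the names `amsterdam`, `avg` and `avg_int` are made up for this illustration, and the final cast is an optional step that assumes you want whole numbers for display.

```python
import pandas as pd

# The six Amsterdam asking prices from the question's dataset (stored as strings there).
amsterdam = pd.DataFrame({
    "place_name": ["Amsterdam"] * 6,
    "koopprijs": ["267500", "655000", "2395000", "498000", "610000", "2230000"],
})

# Convert the string prices to integers before averaging.
amsterdam["koopprijs"] = amsterdam["koopprijs"].astype(int)

# Mean per place; the result is a float64 Series indexed by place_name.
avg = amsterdam.groupby("place_name")["koopprijs"].mean()
print(avg)

# Optional: round and cast back to integers for display.
avg_int = avg.round().astype(int)
print(avg_int)
```

The sum of the six prices is 6655500, and dividing by 6 gives the 1109250 shown in the output above.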
CodePudding user response:
First convert koopprijs to a numeric type, then use groupby with agg:
df['koopprijs'] = df['koopprijs'].astype('int')
avg_price = df.groupby('place_name')['koopprijs'].agg('mean')  # result is a Series; assigned to a new name so df is not overwritten
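If you would rather get a DataFrame back, with the number of listings next to each average, a named aggregation is one option. This is only a sketch on a five-row subset of the question's data; the names `sample`, `summary`, `avg_koopprijs` and `n_listings` are hypothetical, and using `pd.to_numeric(..., errors="coerce")` instead of `astype(int)` is an assumption on my part that you want unparseable prices turned into NaN rather than raising an error.

```python
import pandas as pd

# A small subset of the rows from the question, prices still stored as strings.
sample = pd.DataFrame({
    "place_name": ["Amsterdam", "Amsterdam", "'s-Gravenhage", "'s-Gravenhage", "Goes"],
    "koopprijs": ["267500", "655000", "150000", "710000", "495000"],
})

# Convert to numbers; values that cannot be parsed become NaN and are skipped by mean().
sample["koopprijs"] = pd.to_numeric(sample["koopprijs"], errors="coerce")

# Named aggregation: one row per place with the mean price and the listing count.
summary = (
    sample.groupby("place_name")
    .agg(avg_koopprijs=("koopprijs", "mean"), n_listings=("koopprijs", "count"))
    .reset_index()
)
print(summary)
```

Seeing the count alongside the mean makes it clear which averages come from a single listing.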