Sorry, this probably isn't a very good title, but I'm not sure how else to explain it.
I have a dataframe of houses in different towns:
import pandas as pd

data = [
    ['Oxford', 2016, True],
    ['Oxford', 2016, True],
    ['Oxford', 2018, False],
    ['Cambridge', 2016, False],
    ['Cambridge', 2016, True],
    ['Brighton', 2019, True],
]
df = pd.DataFrame(data, columns=['town', 'year_built', 'is_detached'])
I want to get the mean and median number of houses per town.
How can I do this?
I know how to get the mean (hackily):
len(df) / len(df.town.value_counts())
But I don't know how to get the median.
Answer:
Use value_counts to get the number of houses per town, then agg with 'mean' and 'median':
df['town'].value_counts().agg(['mean', 'median'])
Output:
mean      2.0
median    2.0
Name: town, dtype: float64
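
For comparison, here is a minimal sketch of the same calculation using groupby and size instead of value_counts. It should give the same mean and median for this data. The names houses_per_town and stats are just illustrative, and the Name shown in the printed result may differ between pandas versions (recent versions label value_counts results as count, I believe).

```python
import pandas as pd

data = [
    ['Oxford', 2016, True],
    ['Oxford', 2016, True],
    ['Oxford', 2018, False],
    ['Cambridge', 2016, False],
    ['Cambridge', 2016, True],
    ['Brighton', 2019, True],
]
df = pd.DataFrame(data, columns=['town', 'year_built', 'is_detached'])

# Count rows (houses) per town: Oxford 3, Cambridge 2, Brighton 1
houses_per_town = df.groupby('town').size()

# Aggregate the per-town counts into a mean and a median
stats = houses_per_town.agg(['mean', 'median'])
print(stats)
```

Keeping the per-town counts in their own variable also makes it easy to inspect them or compute other statistics (min, max, std) later.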