I have a dataframe with more than 500 cities that looks like this:
city | value | datetime |
---|---|---|
london | 23 | 2022-03-25 17:59:18 |
dubai | 12 | 2022-03-25 17:59:36 |
berlin | 5 | 2022-03-25 17:59:42 |
london | 25 | 2022-03-25 18:01:18 |
dubai | 12 | 2022-03-25 18:02:18 |
berlin | 5 | 2022-03-25 18:03:18 |
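I have a function called rolling_mean that adds a new column, 'rolling_mean', holding the rolling average of value over the last hour:

    def rolling_mean(df):
        # Index by timestamp so a time-based window ('1h') can be used,
        # then restore the original index before assigning the column.
        df['rolling_mean'] = (df.set_axis(df['datetime'])
                                .rolling('1h')['value']
                                .mean()
                                .set_axis(df.index)
                             )
        return df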
However, I would like to apply this function to each city separately, so that the rolling average for one city is not mixed with values from other cities. With almost 500 cities in the dataframe, I am not sure how to do this.
CodePudding user response:
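You can do it with groupby, which splits the dataframe by city and applies the function to each group separately. For this to work, rolling_mean has to return the dataframe it modifies, and the datetime column must have a datetime dtype (convert it with pd.to_datetime if it holds strings):

    df = df.groupby('city', group_keys=False).apply(rolling_mean)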
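
If it helps, here is a minimal self-contained sketch of the per-city idea using the sample rows from the question. It is one possible way to do it, not the only one: instead of groupby().apply() it loops over the groups explicitly, which avoids depending on how a given pandas version treats the grouping column inside apply. The function name add_rolling_mean and its window parameter are made up for illustration, and the sketch assumes the dataframe index is unique.

```python
import pandas as pd

# Sample rows copied from the question.
df = pd.DataFrame({
    'city': ['london', 'dubai', 'berlin', 'london', 'dubai', 'berlin'],
    'value': [23, 12, 5, 25, 12, 5],
    'datetime': [
        '2022-03-25 17:59:18', '2022-03-25 17:59:36', '2022-03-25 17:59:42',
        '2022-03-25 18:01:18', '2022-03-25 18:02:18', '2022-03-25 18:03:18',
    ],
})
# Time-based windows need a real datetime dtype, not strings.
df['datetime'] = pd.to_datetime(df['datetime'])


def add_rolling_mean(df, window='1h'):
    """Return a copy of df with a per-city rolling mean of 'value'.

    Hypothetical helper; assumes df has a unique index.
    """
    parts = []
    for _, group in df.groupby('city'):
        # Time-based rolling requires timestamps in increasing order.
        group = group.sort_values('datetime')
        # on='datetime' uses that column for the window while keeping
        # the original row index on the result.
        parts.append(group.rolling(window, on='datetime')['value'].mean())
    out = df.copy()
    # Assignment aligns on the original index, so row order is preserved.
    out['rolling_mean'] = pd.concat(parts)
    return out


result = add_rolling_mean(df)
print(result)
```

In this sample the second london row averages 23 and 25, because both readings fall within the same hour, while dubai and berlin are unaffected by london's values.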