I have a pandas DataFrame of 1-minute data that I need to resample every 15 minutes. Normally I would use
df.resample('15T').mean()
Unfortunately, I need the data averaged over a 10-minute group at the start of each 15-minute interval. So for 17:45, the values from 17:45 to 17:55 are averaged together; for 18:00, the values from 18:00 to 18:10 are averaged.
Is there already a function within the pandas library to complete this task?
Below is a small example of what the data could look like.
Example Data In:
DateTime, Value
4/15/15 17:30,0.1
4/15/15 17:45,0.07
4/15/15 17:46,0.08
4/15/15 17:47,0.08
4/15/15 17:48,0.08
4/15/15 17:49,0.09
4/15/15 17:50,0.07
4/15/15 17:51,0.07
4/15/15 17:52,0.06
4/15/15 17:53,0.05
4/15/15 17:54,0.05
4/15/15 17:55,0.05
4/15/15 17:56,0.05
4/15/15 17:57,0.05
4/15/15 17:58,0.04
4/15/15 17:59,0.05
4/15/15 18:00,0.05
4/15/15 18:01,0.03
4/15/15 18:02,0.04
4/15/15 18:03,0.03
4/15/15 18:04,0.03
4/15/15 18:05,0.02
4/15/15 18:06,0.02
4/15/15 18:07,0.02
4/15/15 18:08,0.01
4/15/15 18:09,0.01
4/15/15 18:10,0.01
4/15/15 18:11,-0.01
4/15/15 18:12,-0.01
4/15/15 18:13,0
4/15/15 18:14,-0.02
Example data out:
DateTime, Value
4/15/15 17:30,0.1
4/15/15 17:45,0.07
4/15/15 18:00,0.02
CodePudding user response:
Try using rolling and then resample:
# If the data is not already on a regular 1-minute grid, put it on one first
>>> df = df.resample("1min").last()
# rolling("10min") looks backward, so shift by -9 rows to get a "forward-looking" window
>>> df.rolling("10min").mean().shift(-9).resample("15min").first()
                     Value
DateTime
2015-04-15 17:30:00  0.100
2015-04-15 17:45:00  0.070
2015-04-15 18:00:00  0.026
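To make the role of the shift easier to follow, here is a minimal, self-contained sketch that rebuilds the question's sample data and runs the same chain one step at a time. The intermediate names (`regular`, `trailing`, `forward`) are only illustrative and are not part of the answer.

```python
import pandas as pd

# Rebuild the question's sample: one reading at 17:30, then 1-minute readings 17:45-18:14.
values = [0.07, 0.08, 0.08, 0.08, 0.09, 0.07, 0.07, 0.06, 0.05, 0.05,   # 17:45-17:54
          0.05, 0.05, 0.05, 0.04, 0.05,                                 # 17:55-17:59
          0.05, 0.03, 0.04, 0.03, 0.03, 0.02, 0.02, 0.02, 0.01, 0.01,   # 18:00-18:09
          0.01, -0.01, -0.01, 0.0, -0.02]                               # 18:10-18:14
index = pd.DatetimeIndex(["2015-04-15 17:30"]).append(
    pd.date_range("2015-04-15 17:45", periods=len(values), freq="min"))
df = pd.DataFrame({"Value": [0.1] + values}, index=index)
df.index.name = "DateTime"

# Step 1: regular 1-minute grid; minutes without data become NaN rows.
regular = df.resample("1min").last()
# Step 2: trailing window; the value at t is the mean over (t - 10min, t].
trailing = regular.rolling("10min").mean()
# Step 3: move each value 9 rows earlier, so the value at t covers t .. t + 9min.
forward = trailing.shift(-9)
# Step 4: keep the first non-NaN value in each 15-minute bin.
result = forward.resample("15min").first()
print(result)
```

One thing to keep in mind: `first()` skips NaN, so if the first 10 minutes of a bin hold no data at all, a later row's value may be picked up instead. That does not happen with this sample.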
CodePudding user response:
You can use apply and subset the first 10 minutes/rows:
df.resample('15T').apply(lambda d: d.iloc[:10]['Value'].mean())
output:
DateTime
2015-04-15 17:30:00 0.100
2015-04-15 17:45:00 0.070
2015-04-15 18:00:00 0.026
Freq: 15T, dtype: float64
Or, if you are not sure that all groups have all the minutes:
df.resample('15T').apply(lambda d: list(d.resample('10T'))[0][1]['Value'].mean())
Explanation: for each 15T resampling group, resample to 10T and keep only the first subgroup ([0] takes the first subgroup and [1] gets its data).
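For data with missing minutes, another option (not taken from either answer, just a sketch) is to select rows by their offset from the start of their 15-minute bin and then group by that bin start. The helper name `head_of_bin_mean` and its parameters are hypothetical. The strict `<` assumes the window is half-open, i.e. 17:45 up to but not including 17:55, which matches the output shown in the answers.

```python
import pandas as pd

def head_of_bin_mean(frame, bin_freq="15min", window="10min"):
    """Hypothetical helper: mean of the rows falling in the first `window` of each bin."""
    bin_start = frame.index.floor(bin_freq)            # start of each row's bin
    in_window = (frame.index - bin_start) < pd.Timedelta(window)
    # Group the surviving rows by the start of the bin they belong to.
    return frame[in_window].groupby(bin_start[in_window]).mean()

# Rebuild the question's sample: one reading at 17:30, then 1-minute readings 17:45-18:14.
values = [0.07, 0.08, 0.08, 0.08, 0.09, 0.07, 0.07, 0.06, 0.05, 0.05,
          0.05, 0.05, 0.05, 0.04, 0.05,
          0.05, 0.03, 0.04, 0.03, 0.03, 0.02, 0.02, 0.02, 0.01, 0.01,
          0.01, -0.01, -0.01, 0.0, -0.02]
index = pd.DatetimeIndex(["2015-04-15 17:30"]).append(
    pd.date_range("2015-04-15 17:45", periods=len(values), freq="min"))
df = pd.DataFrame({"Value": [0.1] + values}, index=index)
df.index.name = "DateTime"

result = head_of_bin_mean(df)
print(result)
```

Because rows are selected by timestamp instead of by position, a gap inside the window only reduces the number of values averaged. Bins with no rows inside the window are simply absent from the result instead of appearing as NaN.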