Pandas 1 minute time series into 10 minute mean every 15 minutes

Time:09-24

I have a pandas dataframe of 1 minute data that I need to resample every 15 minutes. Normally I would use

df.resample('15T').mean()

Unfortunately, I need the data averaged over a 10 minute group at the start of each 15 minute interval. So for 17:45, the values from 17:45 to 17:55 are averaged together. For 18:00, the values from 18:00 to 18:10 are averaged.

Is there already a function within the pandas library to complete this task?

Below is a small example of what the data could look like.

Example Data In:

DateTime, Value
4/15/15 17:30,0.1
4/15/15 17:45,0.07
4/15/15 17:46,0.08
4/15/15 17:47,0.08
4/15/15 17:48,0.08
4/15/15 17:49,0.09
4/15/15 17:50,0.07
4/15/15 17:51,0.07
4/15/15 17:52,0.06
4/15/15 17:53,0.05
4/15/15 17:54,0.05
4/15/15 17:55,0.05
4/15/15 17:56,0.05
4/15/15 17:57,0.05
4/15/15 17:58,0.04
4/15/15 17:59,0.05
4/15/15 18:00,0.05
4/15/15 18:01,0.03
4/15/15 18:02,0.04
4/15/15 18:03,0.03
4/15/15 18:04,0.03
4/15/15 18:05,0.02
4/15/15 18:06,0.02
4/15/15 18:07,0.02
4/15/15 18:08,0.01
4/15/15 18:09,0.01
4/15/15 18:10,0.01
4/15/15 18:11,-0.01
4/15/15 18:12,-0.01
4/15/15 18:13,0
4/15/15 18:14,-0.02

Example data out:

DateTime, Value
4/15/15 17:30,0.1
4/15/15 17:45,0.07
4/15/15 18:00,0.02

CodePudding user response:

Try using rolling and then resample:

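# put the data on a regular 1-minute grid first (missing minutes become NaN),
# because the shift below moves values by rows, not by time
df = df.resample("1min").last()

# rolling("10min") looks backward, so shift the result back 9 rows
# to get a "forward-looking" window labelled by its first minute
>>> df.rolling("10min").mean().shift(-9).resample("15min").first()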
                     Value
DateTime                  
2015-04-15 17:30:00  0.100
2015-04-15 17:45:00  0.070
2015-04-15 18:00:00  0.026
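
To see where the 9 comes from: on a 1-minute grid, the trailing 10-minute window that ends at 17:54 covers 17:45 to 17:54, so moving each result back 9 rows labels it with the first minute of its window. Below is a minimal sketch that rebuilds the question's sample data and runs the steps one at a time. The way `df` is constructed and the intermediate names (`regular`, `trailing`, `forward`) are assumptions for illustration only.

```python
import pandas as pd

# Rebuild the question's sample: one reading at 17:30, then
# 1-minute readings from 17:45 through 18:14.
values = [0.07, 0.08, 0.08, 0.08, 0.09, 0.07, 0.07, 0.06, 0.05, 0.05,
          0.05, 0.05, 0.05, 0.04, 0.05,
          0.05, 0.03, 0.04, 0.03, 0.03, 0.02, 0.02, 0.02, 0.01, 0.01,
          0.01, -0.01, -0.01, 0.0, -0.02]
index = pd.DatetimeIndex(["2015-04-15 17:30"]).append(
    pd.date_range("2015-04-15 17:45", periods=len(values), freq="min"))
df = pd.DataFrame({"Value": [0.1] + values}, index=index)
df.index.name = "DateTime"

regular = df.resample("1min").last()        # regular grid, gaps become NaN
trailing = regular.rolling("10min").mean()  # mean over (t - 10min, t]
forward = trailing.shift(-9)                # relabel each mean by its window start
result = forward.resample("15min").first()  # keep the value at each quarter hour

print(result.round(3))
# 17:30 -> 0.100, 17:45 -> 0.070, 18:00 -> 0.026
```

Note that the last 9 rows of `forward` are NaN, so a final quarter hour with fewer than 10 minutes of data after it would come out as NaN.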

CodePudding user response:

You can use apply and take the first 10 rows of each 15-minute group, which covers the first 10 minutes as long as no minutes are missing from the group:

df.resample('15T').apply(lambda d: d.iloc[:10]['Value'].mean())

output:

DateTime
2015-04-15 17:30:00    0.100
2015-04-15 17:45:00    0.070
2015-04-15 18:00:00    0.026
Freq: 15T, dtype: float64

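Or, if you are not sure that every group contains all of its minutes:

df.resample('15T').apply(lambda d: list(d.resample('10T', origin='start'))[0][1]['Value'].mean())

Explanation: for each 15T resampling group, resample to 10T and keep only the first subgroup ([0] takes the first subgroup and [1] gets its data). origin='start' (pandas 1.1 or later) anchors the 10-minute bins at the group's first timestamp; with the default origin the bins are aligned to midnight, so a group starting at 17:45 would be split at 17:50 and only 17:45 to 17:49 would be averaged. If the first minute of a group is missing, the window starts at the first available row instead. ('T' is the older alias for 'min', which recent pandas versions prefer.)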
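
Another option, not taken from the answers above, is to filter by timestamp before resampling. The wanted rows are those whose minute is at offset 0 to 9 within a quarter hour, so a boolean mask followed by an ordinary 15-minute mean should give the same result without depending on row counts. This relies on 15 dividing 60 evenly, so that quarter-hour bins line up with the minute of the hour. The sample frame below is a hypothetical stand-in with one minute deliberately missing.

```python
import pandas as pd

# Hypothetical sample: readings 0, 1, 2, ... at 1-minute steps from
# 17:45 to 18:14, with the 17:47 row removed to simulate a gap.
idx = pd.date_range("2015-04-15 17:45", "2015-04-15 18:14", freq="min")
df = pd.DataFrame({"Value": range(len(idx))}, index=idx, dtype="float64")
df = df.drop(pd.Timestamp("2015-04-15 17:47"))

# Keep only rows in the first 10 minutes of each quarter hour
# (minutes 0-9, 15-24, 30-39, 45-54), then average per 15-minute bin.
in_window = df.index.minute % 15 < 10
result = df[in_window].resample("15min").mean()

print(result)
```

Here the 17:45 bin averages the nine rows that exist between 17:45 and 17:54, and the 18:00 bin averages 18:00 to 18:09.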
