Resampling and computing mean in pandas dataframe

Time:11-12

I have a pandas dataframe with one column and a time-based index. I want to resample the data into two-second bins and compute the average of the values in the column for each bin. Here is an example:

import pandas as pd

index = pd.date_range('1/1/2000', periods=10, freq='s')
data = {'value': [23, 23, 12, 14, 14, 57, 67, 32, 56, 89]}
series = pd.DataFrame(data, index=index)

The above code gives this result.

[image 1: the dataframe created above]

Now, I compute the average of the values for every two seconds.

series['resample_value'] = series['value'].resample('2s').mean()

This gives me the result as shown in the image below.

[image 2: the result of the assignment above]

But I would like the result shown in image 3: the computed averages should be written back to the original dataframe, which is not resampled, so that every row carries the mean of its two-second bin. How do I obtain this?

[image 3: the desired result]

Thanks in advance.

CodePudding user response:

You could use series.to_frame():

df = series['value'].resample('2s').mean().to_frame()

Your line series = pd.DataFrame(data, index=index) creates a dataframe, not a series.
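As a quick check of what that line returns, here is a minimal sketch using the question's data. The name df_orig is only illustrative, and the lowercase 's' alias is a substitution on my part, since newer pandas versions deprecate the uppercase 'S'. The result has one row per two-second bin, so on its own it does not place the means back on the original ten rows.

```python
import pandas as pd

# Same data as in the question; df_orig is an illustrative name
index = pd.date_range('1/1/2000', periods=10, freq='s')
data = {'value': [23, 23, 12, 14, 14, 57, 67, 32, 56, 89]}
df_orig = pd.DataFrame(data, index=index)

# One row per two-second bin, labelled by the start of the bin
df = df_orig['value'].resample('2s').mean().to_frame()

print(df.shape)  # (5, 1): five bins, not the original ten rows
print(df)
```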

CodePudding user response:

You can group the rows by their timestamp floored to two seconds, compute the mean of each group, and broadcast that mean back to the original rows using transform:

series['resample_value'] = series.groupby(series.index.floor('2s'))['value'].transform('mean')

print(series)

                     value  resample_value
2000-01-01 00:00:00     23            23.0
2000-01-01 00:00:01     23            23.0
2000-01-01 00:00:02     12            13.0
2000-01-01 00:00:03     14            13.0
2000-01-01 00:00:04     14            35.5
2000-01-01 00:00:05     57            35.5
2000-01-01 00:00:06     67            49.5
2000-01-01 00:00:07     32            49.5
2000-01-01 00:00:08     56            72.5
2000-01-01 00:00:09     89            72.5
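
A different route, offered only as a sketch and not part of the answer above, is to resample first and then stretch the two-second means back onto the original index with reindex and a forward fill. It assumes the index is sorted and that resample uses its default bins for this frequency (closed and labelled on the left), so each bin label is the first timestamp of its bin. The lowercase 's' alias is used because newer pandas versions deprecate 'S'.

```python
import pandas as pd

index = pd.date_range('1/1/2000', periods=10, freq='s')
data = {'value': [23, 23, 12, 14, 14, 57, 67, 32, 56, 89]}
series = pd.DataFrame(data, index=index)

# Mean per two-second bin; each bin is labelled by its left edge
means = series['value'].resample('2s').mean()

# Forward-fill copies each bin's mean onto every original row in that bin
series['resample_value'] = means.reindex(series.index, method='ffill')

# Cross-check against the groupby/transform approach
expected = series.groupby(series.index.floor('2s'))['value'].transform('mean')
print(series)
print((series['resample_value'] == expected).all())
```

For this data both approaches fill the same column. The groupby version does not depend on how the bins are labelled, which makes it the safer choice if the resampling options change.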