How to efficiently average values between preceding and succeeding row indices with Pandas

Time:09-21

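The objective is to get, for a given row index, the average over the n preceding rows, the row itself, and the n succeeding rows.

In other words, each index is averaged over a list of neighbouring indices, clipped at the start and end of the frame. For example, with n = 2: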

index   indices averaged

0   0,1,2
1   0,1,2,3
2   0,1,2,3,4
...
9   7,8,9,10,11
10  8,9,10,11,12

This can be achieved as below:

import pandas as pd
arr=[[6772],
[7182],
[8570],
[11078],
[11646],
[13426],
[16996],
[17514],
[18408],
[22128],
[22520],
[23532],
[26164],
[26590],
[30636],
[3119],
[32166],
[34774]]
df=pd.DataFrame(arr,columns=['a'])
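df['cal']=0.0  # float column, since the means are not integers
idx_c=2  # number of rows to include on each side
for idx in range(len(df)):
    idx_l=idx-idx_c
    idx_t=idx+idx_c
    idx_l=0 if idx_l<0 else idx_l  # clip the window at the first row
    idx_t=len(df)-1 if idx_t>len(df)-1 else idx_t  # clip the window at the last row
    df.loc[idx,'cal']=df['a'][df.index.isin(range(idx_l,idx_t+1))].mean()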

However, I wonder whether there is a more efficient way to achieve this?

Answer:

Series.rolling

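The trick here is to use a rolling window of size 2*w + 1 with the optional parameter center=True, which places each result at the centre of its window rather than at its right edge. For example, if w=2 then the window size is 2*w + 1 = 5 and the result of each rolling computation is stored at the 3rd position of the window. Setting min_periods=1 lets the windows at the edges of the frame, which are shorter than 5 rows, still produce a mean instead of NaN.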

w = 2
df['avg'] = df['a'].rolling(2 * w + 1, center=True, min_periods=1).mean()

print(df)

        a       avg
0    6772   7508.00
1    7182   8400.50
2    8570   9049.60
3   11078  10380.40
4   11646  12343.20
5   13426  14132.00
6   16996  15598.00
7   17514  17694.40
8   18408  19513.20
9   22128  20820.40
10  22520  22550.40
11  23532  24186.80
12  26164  25888.40
13  26590  22008.20
14  30636  23735.00
15   3119  25457.00
16  32166  25173.75
17  34774  23353.00
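As a sanity check, the rolling version can be compared against an explicit loop on a few rows of the same data. This is only a minimal sketch: the helper names centered_mean_loop and centered_mean_rolling are made up for illustration and are not part of pandas.

```python
import pandas as pd


def centered_mean_loop(s: pd.Series, w: int) -> pd.Series:
    """Reference version: average rows i-w .. i+w, clipped to the Series bounds."""
    n = len(s)
    out = []
    for i in range(n):
        lo = max(i - w, 0)        # clip at the first row
        hi = min(i + w, n - 1)    # clip at the last row
        out.append(s.iloc[lo:hi + 1].mean())
    return pd.Series(out, index=s.index, dtype=float)


def centered_mean_rolling(s: pd.Series, w: int) -> pd.Series:
    """Vectorised version: centred window of 2*w + 1 rows, shorter at the edges."""
    return s.rolling(2 * w + 1, center=True, min_periods=1).mean()


# First six values of column 'a' from the question
a = pd.Series([6772, 7182, 8570, 11078, 11646, 13426], name="a")
w = 2

loop_avg = centered_mean_loop(a, w)
roll_avg = centered_mean_rolling(a, w)

# Raises AssertionError if the two approaches disagree (within float tolerance)
pd.testing.assert_series_equal(loop_avg, roll_avg, check_names=False)
```

If the assertion passes, both approaches produce the same averages; the rolling call simply avoids the Python-level loop and the per-row boolean mask.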