Home > Software design >  Rolling Percentile Function outputting 0's in column?
Rolling Percentile Function outputting 0's in column?

Time:12-16

When creating a function, and using rolling( ) with the apply( ) to calculate a rolling 3 day percentile distribution, it is displaying 0's after the first 3 days for the rest of the Column.

I'm assuming that the first 2 days which have NaN Values are not being used in the calculation of the percentile function, and therefore maybe defaulting the rest of the columns to Zero, and incorrectly giving the 33 value for the third day. But im not sure about this.

I have been trying to solve this, but have not got any solution. Does anybody know why and how to solve correct this code below ? it would be greatly appreciated.

import pandas as pd
import numpy as np
from scipy import stats
data = { 'a': [1, 15, 27, 399, 17, 568, 200, 9], 
         'b': [2, 30, 15, 60, 15, 80, 53, 41],
         'c': [100,200, 3, 78, 25, 88, 300, 91],
         'd': [4, 300, 400, 500, 23, 43, 9, 71]
         }

dfgrass = pd.DataFrame(data)
def percnum(x):
    for t in dfgrass.index:
        aaa = (x<=dfgrass.loc[t,'b']).value_counts()
        ccc = (x<=dfgrass.loc[t, 'b']).values.sum()
        vvv = len(x)
        nnn = ccc/ vvv
        return nnn * 100

dfgrass['e'] = dfgrass['b'].rolling(window=3).apply(percnum)
print(dfgrass)

CodePudding user response:

Perhaps try changing for t in dfgrass.index to for t in x.index in your implementation of def percnum(x) like so:

def percnum(x):
    for t in x.index:
        aaa = (x<=dfgrass.loc[t,'b']).value_counts()
        ccc = (x<=dfgrass.loc[t, 'b']).values.sum()
        vvv = len(x)
        nnn = ccc/ vvv
        return nnn * 100

CodePudding user response:

If you're trying to compute the percentile ranks, then you can try something like

def percnum(x):
    n = len(x)
    temp = x.argsort()
    ranks = np.empty(n)
    ranks[temp] = (np.arange(n)   1) / n
    return ranks[-1]

dfgrass.rolling(3).apply(percnum)

which gives the following output

          a         b         c         d
0       NaN       NaN       NaN       NaN
1       NaN       NaN       NaN       NaN
2  1.000000  0.666667  0.333333  1.000000
3  1.000000  1.000000  0.666667  1.000000
4  0.333333  0.666667  0.666667  0.333333
5  1.000000  1.000000  1.000000  0.666667
6  0.666667  0.666667  1.000000  0.333333
7  0.333333  0.333333  0.666667  1.000000
  • Related