Efficient way to compute Aroon indicator in pandas

Time:03-07

I have to compute the Aroon indicator on data stored in a dataframe:

import pandas as pd
import numpy as np


N = 100000
np.random.seed(42)


df = pd.DataFrame()
df['Time'] = np.arange(1, N + 1, 1)
df['High'] = 10 + np.sin(2*np.pi/(N/2)*df['Time']) + 0.5*np.random.randn(N)
df['Low'] = df['High'] - (0.1*np.random.randn(N) + 1)**2
   Time       High       Low
0     1  10.248483  9.031743
1     2   9.931119  9.148842
2     3  10.324221  9.205823
3     4  10.762018  9.882031
4     5   9.883552  8.947960
5     6   9.883686  8.874142
6     7  10.790486  9.814241
7     8  10.384723  9.691851
8     9   9.766394  8.470937
9    10  10.272537  9.032786

Following this answer, I can use:

n = 25
df['Aroon Up'] = 100*df['High'].rolling(n + 1).apply(lambda x: x.argmax())/n
df['Aroon Down'] = 100*df['Low'].rolling(n + 1).apply(lambda x: x.argmin())/n

This works, but it is very slow on the dataframe I actually have to process, which has over 500,000 rows.
How can I speed up the Aroon indicator computation?

CodePudding user response:

You can use NumPy's sliding_window_view as a replacement for rolling:

from numpy.lib.stride_tricks import sliding_window_view

aroon_up = 100 * sliding_window_view(df['High'], n + 1).argmax(1) / n
aroon_down = 100 * sliding_window_view(df['Low'], n + 1).argmin(1) / n

# The sliding window trims the first n rows, so pad them with NaN to restore the original length
df['Aroon Up'] = np.hstack([[np.nan]*n, aroon_up])
df['Aroon Down'] = np.hstack([[np.nan]*n, aroon_down])
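
As a sanity check, the sketch below compares the windowed version against the rolling/apply formulation from the question on a small synthetic sample. The sample data, the variable names, and the use of raw=True (which passes plain arrays to the lambda) are assumptions made for illustration; they are not part of the original timing setup.

```python
import numpy as np
import pandas as pd
from numpy.lib.stride_tricks import sliding_window_view

# Hypothetical small sample; continuous random values, so ties are not expected
rng = np.random.default_rng(0)
n = 5
small = pd.DataFrame({'High': rng.normal(10, 1, 200)})
small['Low'] = small['High'] - rng.uniform(0.5, 1.5, 200)

# Reference: rolling/apply as in the question (raw=True is an added assumption)
ref_up = 100 * small['High'].rolling(n + 1).apply(lambda x: x.argmax(), raw=True) / n
ref_down = 100 * small['Low'].rolling(n + 1).apply(lambda x: x.argmin(), raw=True) / n

# Vectorized: one row per window, argmax/argmin taken along each row
high_windows = sliding_window_view(small['High'].to_numpy(), n + 1)
low_windows = sliding_window_view(small['Low'].to_numpy(), n + 1)
fast_up = np.hstack([[np.nan] * n, 100 * high_windows.argmax(1) / n])
fast_down = np.hstack([[np.nan] * n, 100 * low_windows.argmin(1) / n])

# Both formulations should agree, including the leading NaN padding
same_up = np.allclose(ref_up.to_numpy(), fast_up, equal_nan=True)
same_down = np.allclose(ref_down.to_numpy(), fast_down, equal_nan=True)
print(same_up, same_down)
```

Note that sliding_window_view requires NumPy 1.20 or later.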

For 500K records:

%timeit 100 * sliding_window_view(df['High'], n + 1).argmax(1) / n
31.8 ms ± 482 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

%timeit 100*df['High'].rolling(n + 1).apply(lambda x: x.argmax())/n
30.7 s ± 412 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

CodePudding user response:

Here is a numba version:

import numpy as np
from numba import jit

@jit(nopython=True)
def aroon(data, period):
    size = len(data)
    out_up = np.array([np.nan] * size)
    out_down = np.array([np.nan] * size)
    for i in range(period - 1, size):
        # Reverse the window so that index 0 is the most recent observation
        window = np.flip(data[i + 1 - period:i + 1])
        out_up[i] = ((period - window.argmax()) / period) * 100
        out_down[i] = ((period - window.argmin()) / period) * 100
    return out_up, out_down
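
For environments without numba, the same convention can be sketched with sliding_window_view. The helper name aroon_windowed is hypothetical, and this is only an illustration under the assumption that the loop above is the intended definition. That loop uses a window of period observations and yields values between 100/period and 100, so its results are not expected to match the rolling(n + 1) formulation from the question.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def aroon_windowed(data, period):
    """Hypothetical helper: same convention as the numba loop, vectorized."""
    data = np.asarray(data, dtype=float)
    out_up = np.full(data.size, np.nan)
    out_down = np.full(data.size, np.nan)
    # One row per window, reversed so that index 0 is the most recent value
    windows = sliding_window_view(data, period)[:, ::-1]
    out_up[period - 1:] = 100 * (period - windows.argmax(axis=1)) / period
    out_down[period - 1:] = 100 * (period - windows.argmin(axis=1)) / period
    return out_up, out_down

# Tiny worked example: windows [1,3,2], [3,2,5], [2,5,4] with period 3
up, down = aroon_windowed([1.0, 3.0, 2.0, 5.0, 4.0], 3)
print(up)
print(down)
```

Both this helper and the numba function take a single series and return the up and down values computed from it. One plausible way to apply either of them to the frame is to call it once on df['High'].to_numpy() and keep the first output, then once on df['Low'].to_numpy() and keep the second.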