I have written a function that reshapes a NumPy array into overlapping windows of a given length. The function is as follows:
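import numpy as np

def w_s(df, l):
    """
    Convert numpy array into desired shape with lag 1.

    Args:
        df (numpy.ndarray): Numpy array.
        l (integer): Length of the sample window.

    Returns:
        Returns numpy array in a desired shape to be used in decision trees.
    """
    # Start from a flat float copy of df (the l leading zeros are dropped again).
    data = np.zeros((l, 1))
    data = np.append(data, df)
    data = data[l:]
    # Append df shifted left by 1, 2, ..., l-1 positions.
    for i in range(1, l):
        s1 = np.roll(df, -i)
        data = np.append(data, s1)
    # One shifted copy per column, then drop the rows that wrapped around.
    data = data.reshape(l, len(df)).T
    return data[:-(l-1)]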
I have two arrays of length 1780000. The function takes around 3 hours.
CPU times: user 5min 20s, sys: 43min 45s, total: 49min 6s
Wall time: 3h 5min 46s
My machine is an M1 Mac. I am running this in a Jupyter notebook cell, with the notebook open in Firefox. How can I do this faster?
CodePudding user response:
That's sliding_window_view (in np.lib.stride_tricks, NumPy 1.20 or later):
>>> import numpy as np
>>> arr = np.arange(10)
>>> np.lib.stride_tricks.sliding_window_view(arr, 5)
array([[0, 1, 2, 3, 4],
       [1, 2, 3, 4, 5],
       [2, 3, 4, 5, 6],
       [3, 4, 5, 6, 7],
       [4, 5, 6, 7, 8],
       [5, 6, 7, 8, 9]])
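
To check that this matches the function in the question, here is a minimal sketch that runs both on a small array and compares the results. The names w_s_loop and w_s_view are illustrative only. w_s_loop is a condensed version of the question's approach, assuming l is at least 2.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def w_s_loop(df, l):
    # Condensed version of the question's approach: append l-1 rolled copies.
    # Every np.append call allocates a new array and copies all data so far.
    data = np.asarray(df, dtype=float)
    for i in range(1, l):
        data = np.append(data, np.roll(df, -i))
    data = data.reshape(l, len(df)).T
    return data[:-(l - 1)]


def w_s_view(df, l):
    # Row i is df[i:i + l]. The result is a read-only view on df,
    # so no window data is copied.
    return sliding_window_view(df, l)


arr = np.arange(10)
old = w_s_loop(arr, 5)
new = w_s_view(arr, 5)

print(new.shape)
print(np.array_equal(old, new))
```

Two differences are worth keeping in mind. The question's function returns a float array, while the view keeps the dtype of the input. The view is also read-only by default, so call .copy() on it if the windows need to be modified, at the cost of materialising all len(df) - l + 1 rows in memory.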