I want to iterate over a large list and, at each step, do some computations using the n elements before the Nth index of the list. I've solved it using the following code snippet:
mylist = [1,2,3,4,5,6,7,8,9,10,11,12,13,14]
for i in range(len(mylist)):
    j = i + 3
    data_till_i = mylist[:j]
    current_window = data_till_i[-3:]
    print(current_window)
I get the following from the above code snippet:
[1, 2, 3]
[2, 3, 4]
[3, 4, 5]
[4, 5, 6]
[5, 6, 7]
[6, 7, 8]
[7, 8, 9]
[8, 9, 10]
[9, 10, 11]
[10, 11, 12]
[11, 12, 13]
[12, 13, 14]
[12, 13, 14]
[12, 13, 14]
Is there a one-liner or a more efficient way to do exactly the same thing in less computation time? My list is very large (more than 100K elements), so I'm worried about time complexity.
Thank you.
CodePudding user response:
You can try sliding_window_view and then repeat the last window n-1 times, so that the output has one row per list element:
import numpy as np
n = 3
mylist = [1,2,3,4,5,6,7,8,9,10,11,12,13,14]
window = np.lib.stride_tricks.sliding_window_view(mylist, n)
out = np.append(window, [window[-1] for _ in range(n-1)], axis=0)
print(out)
[[ 1  2  3]
 [ 2  3  4]
 [ 3  4  5]
 [ 4  5  6]
 [ 5  6  7]
 [ 6  7  8]
 [ 7  8  9]
 [ 8  9 10]
 [ 9 10 11]
 [10 11 12]
 [11 12 13]
 [12 13 14]
 [12 13 14]
 [12 13 14]]
For a one-liner, if your Python version is 3.8 or later, you can use the walrus operator:
out = np.append((window := np.lib.stride_tricks.sliding_window_view(mylist, n)),
                [window[-1] for _ in range(n-1)], axis=0)
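As a possible alternative to np.append, the repeated trailing windows can also be produced by indexing the view with clipped start positions. This is only a sketch: the names `starts` and `out` are illustrative, and it assumes NumPy 1.20 or later, where sliding_window_view is available.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

n = 3
mylist = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]

# One row per full window; this is a read-only view, not a copy.
window = sliding_window_view(np.asarray(mylist), n)

# One start position per input element, clipped to the last full window,
# so the final n-1 rows repeat the last window.
starts = np.minimum(np.arange(len(mylist)), len(window) - 1)
out = window[starts]  # fancy indexing copies the selected rows

print(out.shape)  # (14, 3)
```

The indexing step still copies the data, like np.append does, but it avoids building the intermediate Python list of repeated rows.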
CodePudding user response:
A list comprehension, for example? (Use NumPy arrays for faster iteration.)
import numpy as np
mylist = np.array([1,2,3,4,5,6,7,8,9,10,11,12,13,14])
chunk_size = 3
# The "+ 1" makes the range include the last full window.
splited_list = np.array([mylist[x:x+chunk_size] for x in range(0, len(mylist)-chunk_size+1)])
You can keep the result as a NumPy array, as here, or convert every item in it to a plain Python list.
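To illustrate the conversion back to plain Python lists, here is a minimal sketch using ndarray.tolist(). The variable names follow the snippet above, except `as_lists`, which is a hypothetical name introduced for this example.

```python
import numpy as np

mylist = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14])
chunk_size = 3

# Build every full window of chunk_size consecutive elements.
splited_list = np.array(
    [mylist[x:x + chunk_size] for x in range(len(mylist) - chunk_size + 1)]
)

# tolist() converts the 2-D array into nested lists of Python ints.
as_lists = splited_list.tolist()
print(as_lists[:2])  # [[1, 2, 3], [2, 3, 4]]
print(as_lists[-1])  # [12, 13, 14]
```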
CodePudding user response:
What you're after is called a rolling window operation. If you want to work on the list type specifically, there is a shorter formulation (the proposal linked here uses islice; the version below uses plain slicing):
window_size = 3
for i in range(len(mylist) - window_size + 1):
    print(mylist[i: i + window_size])
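For reference, an islice-based variant might look like the sketch below. It follows the sliding_window recipe from the itertools documentation, not necessarily the code behind the link above, so treat the function name and structure as assumptions. It yields tuples lazily, which avoids holding all windows in memory at once.

```python
from collections import deque
from itertools import islice

def sliding_window(iterable, n):
    """Yield successive overlapping windows of length n as tuples."""
    it = iter(iterable)
    # Pre-fill the deque with the first n-1 items.
    window = deque(islice(it, n - 1), maxlen=n)
    for x in it:
        window.append(x)  # with maxlen=n the oldest item drops out
        yield tuple(window)

mylist = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
for w in sliding_window(mylist, 3):
    print(w)
```

Because it only needs an iterator, this version also works on inputs that do not support slicing, such as generators or file objects.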
If your data is numerical, as in the example, I'd suggest using numpy instead, as it will give you much better performance. Using the proposal from here, your example becomes:
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view
sliding_window_view(np.array(mylist), window_shape=3)
To give you a feeling for the timing, we can turn the options above into functions, create a much longer list, and compare the timing using timeit, e.g. in Jupyter:
def rolling_window_using_iterator(list_, window_size):
    result = []
    for i in range(len(list_) - window_size + 1):
        result.append(list_[i: i + window_size])
    return result

def rolling_window_using_numpy(list_, window_size):
    # window_shape is the window length
    return sliding_window_view(np.array(list_), window_shape=window_size)

long_list = list(range(10000000))
%timeit rolling_window_using_iterator(long_list, 3)
%timeit rolling_window_using_numpy(long_list, 3)
prints (on my machine):
1.8 s ± 22 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
422 ms ± 967 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)
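Outside Jupyter, roughly the same comparison could be run with the standard timeit module. The sketch below uses a shorter list and a single repetition, both chosen arbitrarily to keep the run short, so its numbers will not match the ones above.

```python
import timeit

import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def rolling_window_using_iterator(list_, window_size):
    # Plain-Python version: one list slice per window.
    return [list_[i:i + window_size] for i in range(len(list_) - window_size + 1)]

def rolling_window_using_numpy(list_, window_size):
    # NumPy version: converts the list, then returns a strided view.
    return sliding_window_view(np.array(list_), window_shape=window_size)

long_list = list(range(100_000))  # arbitrary size for a quick run

# Sanity check: both approaches produce the same windows.
expected = rolling_window_using_iterator(long_list, 3)
assert rolling_window_using_numpy(long_list, 3).tolist() == expected

for func in (rolling_window_using_iterator, rolling_window_using_numpy):
    seconds = timeit.timeit(lambda: func(long_list, 3), number=1)
    print(f"{func.__name__}: {seconds:.4f} s")
```

Note that the NumPy timing includes converting the list with np.array; if the data already lives in an array, that cost disappears.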
CodePudding user response:
I tried it this way; it iterates over the list in less than a second:
myList = [1,2,3,4,5,6,7,8,9,10,11,12,13,14]
for index, val in enumerate(myList):
    if index >= 3:
        print("{} : {}".format(index, myList[index-3:index]))
The slice myList[index-3:index] takes the three elements before position index, that is, from index-3 up to but not including index.
Hope it helps