I have a DataFrame and I'd like to build sliding subsequences of its data:
import pandas as pd

d = pd.DataFrame({'t': [1, 2, 3, 4, 5, 6]})
x = []
window = 3
# collect every run of `window` consecutive values
for i in range(0, len(d) - window + 1):
    x.append(d[i: i + window].t.values)
pd.DataFrame(x, columns=['t1', 't2', 't3'])
I get this result:
   t1  t2  t3
0   1   2   3
1   2   3   4
2   3   4   5
3   4   5   6
It works, but it is very slow for a large DataFrame. Is it possible to make the procedure faster?
CodePudding user response:
You can use NumPy's sliding_window_view (available in NumPy 1.20 and later):
import pandas as pd
from numpy.lib.stride_tricks import sliding_window_view

W = 3  # window length
pd.DataFrame(sliding_window_view(d['t'], W),
             columns=[f't{i + 1}' for i in range(W)])
# t1 t2 t3
#0 1 2 3
#1 2 3 4
#2 3 4 5
#3 4 5 6
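As a rough sketch, the same idea can be wrapped in a small helper that works for any window length. The names window_frame and prefix are illustrative only and are not part of NumPy or pandas. Note that sliding_window_view returns a read-only view by default, so take a copy if you need to modify the values.

```python
import pandas as pd
from numpy.lib.stride_tricks import sliding_window_view


def window_frame(series, window, prefix='t'):
    """Return a DataFrame whose rows are the sliding windows of `series`.

    Hypothetical helper for illustration; `prefix` names the columns
    prefix1, prefix2, ...
    """
    values = series.to_numpy()
    # Resulting shape is (len(values) - window + 1, window)
    view = sliding_window_view(values, window)
    columns = [f'{prefix}{i + 1}' for i in range(window)]
    return pd.DataFrame(view, columns=columns)


d = pd.DataFrame({'t': [1, 2, 3, 4, 5, 6]})
result = window_frame(d['t'], 3)
print(result)
```

The window extraction itself involves no Python-level loop, which is why this approach tends to scale much better than appending slices one at a time.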
CodePudding user response:
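You can use this trick with pandas:
lst = []
# rolling() converts the values to float, hence the cast back to int;
# `or 0` returns a dummy number because apply() expects a numeric result
d.rolling(3).apply(lambda x: lst.append(x.apply(int).tolist()) or 0)
result = pd.DataFrame.from_records(lst, columns=['t1', 't2', 't3'])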
Here is the result:
   t1  t2  t3
0   1   2   3
1   2   3   4
2   3   4   5
3   4   5   6
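To see how the approaches in this thread compare on your own data, a timing sketch along these lines may help. The function names (windows_loop, windows_rolling, windows_numpy) and the test size of 2000 rows are assumptions made for illustration; the numbers will vary by machine and library version, so none are quoted here.

```python
import timeit

import numpy as np
import pandas as pd
from numpy.lib.stride_tricks import sliding_window_view


def windows_loop(frame, window):
    # Original approach: slice the DataFrame once per window
    rows = [frame[i: i + window].t.values
            for i in range(len(frame) - window + 1)]
    return np.array(rows)


def windows_rolling(frame, window):
    # rolling().apply() trick: collect each window as a side effect
    rows = []
    frame['t'].rolling(window).apply(
        lambda x: rows.append(x.apply(int).tolist()) or 0)
    return np.array(rows)


def windows_numpy(frame, window):
    # NumPy approach: a strided view, no Python-level loop
    return sliding_window_view(frame['t'].to_numpy(), window)


d = pd.DataFrame({'t': np.arange(1, 2001)})
for func in (windows_loop, windows_rolling, windows_numpy):
    seconds = timeit.timeit(lambda: func(d, 3), number=3)
    print(f'{func.__name__}: {seconds:.4f} s')
```

Since the rolling trick still calls a Python function once per window, it is reasonable to expect it to behave more like the loop than like the NumPy view, but measure before relying on that.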