I have a data set like below:
cluster order label
0 1 1 a
1 1 2 b
2 1 3 c
3 1 4 c
4 1 5 b
5 2 1 b
6 2 2 b
7 2 3 c
8 2 4 a
9 2 5 a
10 2 6 b
11 2 7 c
12 2 8 c
I want to add a column that concatenates the previous 3 values of the label column within each cluster (a rolling window of 3). It seems pandas rolling only works on numeric data. Is there a way to concatenate strings? Desired output:
cluster order label roll3
0 1 1 a NaN
1 1 2 b NaN
2 1 3 c NaN
3 1 4 c abc
4 1 5 b bcc
5 2 1 b NaN
6 2 2 b NaN
7 2 3 c NaN
8 2 4 a bbc
9 2 5 a bca
10 2 6 b caa
11 2 7 c aab
12 2 8 c abc
CodePudding user response:
Use groupby.apply to shift and concatenate the labels:
# group_keys=False keeps the original index so the result aligns with df
df['roll3'] = (df.groupby('cluster', group_keys=False)['label']
                 .apply(lambda x: x.shift(3) + x.shift(2) + x.shift(1)))
# cluster order label roll3
# 0 1 1 a NaN
# 1 1 2 b NaN
# 2 1 3 c NaN
# 3 1 4 c abc
# 4 1 5 b bcc
# 5 2 1 b NaN
# 6 2 2 b NaN
# 7 2 3 c NaN
# 8 2 4 a bbc
# 9 2 5 a bca
# 10 2 6 b caa
# 11 2 7 c aab
# 12 2 8 c abc
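
If the window size needs to vary, the same idea can probably be written without apply by calling shift on the grouped column directly. This is only a sketch under a few assumptions: the helper name rolling_concat and its window parameter are made up for illustration, and the sample frame is rebuilt from the data in the question.

```python
import pandas as pd

# Sample data rebuilt from the question.
df = pd.DataFrame({
    "cluster": [1] * 5 + [2] * 8,
    "order": [1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 6, 7, 8],
    "label": list("abccb") + list("bbcaabcc"),
})


def rolling_concat(frame, window=3):
    """Concatenate the previous `window` labels within each cluster.

    Hypothetical helper, not part of pandas.
    """
    grouped = frame.groupby("cluster")["label"]
    # Start with the oldest label in the window, then append newer ones.
    out = grouped.shift(window)
    for k in range(window - 1, 0, -1):
        out = out + grouped.shift(k)
    return out


df["roll3"] = rolling_concat(df)
print(df)
```

Rows that do not have a full window of previous labels in their cluster stay NaN, because adding a string to a missing value gives a missing value.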