How to concatenate strings within a rolling window for each group in pandas?

Time:11-30

I have a data set like below:

    cluster  order label
0         1      1     a
1         1      2     b
2         1      3     c
3         1      4     c
4         1      5     b
5         2      1     b
6         2      2     b
7         2      3     c
8         2      4     a
9         2      5     a
10        2      6     b
11        2      7     c
12        2      8     c

I want to add a column that concatenates the previous three values of the label column within each cluster, i.e. a rolling window of 3 that excludes the current row. It seems pandas rolling only supports numeric calculations. Is there a way to concatenate strings instead?

    cluster  order label roll3
0         1      1     a   NaN
1         1      2     b   NaN
2         1      3     c   NaN
3         1      4     c   abc
4         1      5     b   bcc
5         2      1     b   NaN
6         2      2     b   NaN
7         2      3     c   NaN
8         2      4     a   bbc
9         2      5     a   bca
10        2      6     b   caa
11        2      7     c   aab
12        2      8     c   abc

CodePudding user response:

Use groupby.apply to shift and concat the labels:

# group_keys=False keeps the original row index so the result aligns with df
df['roll3'] = (df.groupby('cluster', group_keys=False)['label']
                 .apply(lambda x: x.shift(3) + x.shift(2) + x.shift(1)))

#     cluster  order label roll3
# 0         1      1     a   NaN
# 1         1      2     b   NaN
# 2         1      3     c   NaN
# 3         1      4     c   abc
# 4         1      5     b   bcc
# 5         2      1     b   NaN
# 6         2      2     b   NaN
# 7         2      3     c   NaN
# 8         2      4     a   bbc
# 9         2      5     a   bca
# 10        2      6     b   caa
# 11        2      7     c   aab
# 12        2      8     c   abc
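
The same idea can be sketched for an arbitrary window size. This is only an illustration, not part of the original answer: the helper name `roll_concat` and its parameters are made up here, and it assumes the rows are already sorted by order within each cluster. It uses the groupby `shift` method directly instead of `apply`, so the result keeps the original row index.

```python
from functools import reduce
import operator

import pandas as pd

# Rebuild the sample data from the question.
df = pd.DataFrame({
    "cluster": [1] * 5 + [2] * 8,
    "order": list(range(1, 6)) + list(range(1, 9)),
    "label": list("abccb") + list("bbcaabcc"),
})


def roll_concat(frame, group_col, value_col, window):
    """Hypothetical helper: concatenate the previous `window` strings per group."""
    grouped = frame.groupby(group_col)[value_col]
    # shift(window) is the oldest value in the window, shift(1) the most recent
    # previous one; rows without enough history contain NaN.
    shifted = [grouped.shift(k) for k in range(window, 0, -1)]
    # Adding string Series concatenates element-wise; NaN propagates.
    return reduce(operator.add, shifted)


df["roll3"] = roll_concat(df, "cluster", "label", 3)
print(df)
```

With `window=3` this should reproduce the roll3 column shown above; the first `window` rows of each cluster stay NaN because there is not enough history.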