library(data.table)
DT <- data.table(
N = 1:16,
x = c(11,11,11,11,11,11,11,11,21,21,21,21,21,21,21,21),
y = c(1,2,3,4,4,4,4,4,1,2,3,4,4,4,4,4),
z = c(53,71,27,64,43,62,61,85,44,56,23,37,31,48,80,38)
)
I want to get a column with values as average of z with respect to y for each x values like
N x y z Roll Mean
1 11 1 53 NA
2 11 2 71 53
3 11 3 27 62
4 11 4 64 50.33333333
5 11 4 43 53.75
6 11 4 62 51.25
7 11 4 61 49
8 11 4 85 57.5
9 21 1 44 NA
10 21 2 56 44
11 21 3 23 50
12 21 4 37 41
13 21 4 31 40
14 21 4 48 36.75
15 21 4 80 34.75
16 21 4 38 49
for example when N=2 x=11, y=2 the roll mean = average of previous 1 z term = mean(53) when N=3, x=11, y=3 the roll mean = average of previous 1 and 2 z term = mean(53&71 ) when N=4 x=11, y=4 the roll mean = average of previous 1, 2 and 3rd z terms = mean(53,71,27 ) then N = 5 to 8 I have to get averages of previous 4 values. I wrote the code
DT[, RollingAvg := frollapply(z,4, mean), .(x)]
gives the output
N x y z RollingAvg
1: 1 11 1 53 NA
2: 2 11 1 71 NA
3: 3 11 1 27 NA
4: 4 11 1 64 53.75
5: 5 11 1 43 51.25
6: 6 11 1 62 49.00
7: 7 11 1 61 57.50
8: 8 11 1 85 62.75
9: 9 21 1 44 NA
10: 10 21 1 56 NA
11: 11 21 1 23 NA
12: 12 21 1 37 40.00
13: 13 21 1 31 36.75
14: 14 21 1 48 34.75
15: 15 21 1 80 49.00
16: 16 21 1 38 49.25
How can I get the correct out put
CodePudding user response:
We may use rollapply
with partial = TRUE
library(zoo)
library(data.table)
DT[, RollingAvg := shift(rollapply(z, 4, mean,
partial = TRUE, align = 'right')), by = x]
-output
> DT
N x y z RollingAvg
<int> <num> <num> <num> <num>
1: 1 11 1 53.00 NA
2: 2 11 2 71.00 53.00000
3: 3 11 3 27.00 62.00000
4: 4 11 4 64.00 50.33333
5: 5 11 4 43.00 53.75000
6: 6 11 4 62.00 51.25000
7: 7 11 4 61.00 49.00000
8: 8 11 1 85.00 57.50000
9: 9 21 2 44.00 NA
10: 10 21 3 56.00 44.00000
11: 11 21 4 23.00 50.00000
12: 12 21 4 37.00 41.00000
13: 13 21 4 31.00 40.00000
14: 14 21 4 48.00 36.75000
15: 15 21 1 80.38 34.75000
16: 16 21 2 53.00 49.09500
In addition, frollmean
can have adaptive
option if we specify n
as a vector of values
DT[, RollingAvg := shift(frollmean(z, rep(1:4, c(1, 1, 1, .N-3)),
adaptive = TRUE)), by = x]
-output
> DT
N x y z RollingAvg
<int> <num> <num> <num> <num>
1: 1 11 1 53.00 NA
2: 2 11 2 71.00 53.00000
3: 3 11 3 27.00 62.00000
4: 4 11 4 64.00 50.33333
5: 5 11 4 43.00 53.75000
6: 6 11 4 62.00 51.25000
7: 7 11 4 61.00 49.00000
8: 8 11 1 85.00 57.50000
9: 9 21 2 44.00 NA
10: 10 21 3 56.00 44.00000
11: 11 21 4 23.00 50.00000
12: 12 21 4 37.00 41.00000
13: 13 21 4 31.00 40.00000
14: 14 21 4 48.00 36.75000
15: 15 21 1 80.38 34.75000
16: 16 21 2 53.00 49.09500
CodePudding user response:
DT[, rm := shift(as.data.frame(frollmean(z, 1:4))[cbind(1:.N, y)])]
# OR
DT[, rm := shift(unlist(frollmean(z, 1:4))[.I (y-1)*.N])]
DT[y == 1L, rm := NA_real_]
# N x y z rm
# <int> <num> <num> <num> <num>
# 1: 1 11 1 53 NA
# 2: 2 11 2 71 53.00000
# 3: 3 11 3 27 62.00000
# 4: 4 11 4 64 50.33333
# 5: 5 11 4 43 53.75000
# 6: 6 11 4 62 51.25000
# 7: 7 11 4 61 49.00000
# 8: 8 11 4 85 57.50000
# 9: 9 21 1 44 NA
# 10: 10 21 2 56 44.00000
# 11: 11 21 3 23 50.00000
# 12: 12 21 4 37 41.00000
# 13: 13 21 4 31 40.00000
# 14: 14 21 4 48 36.75000
# 15: 15 21 4 80 34.75000
# 16: 16 21 4 38 49.00000
Properly reproducible data:
DT <- data.table(
N = 1:16,
x = c(11,11,11,11,11,11,11,11,21,21,21,21,21,21,21,21),
y = c(1,2,3,4,4,4,4,4,1,2,3,4,4,4,4,4),
z = c(53,71,27,64,43,62,61,85,44,56,23,37,31,48,80,38)
)