I've used rnorm to simulate n = 100 draws from a normal distribution. What I want to do now is calculate the mean of "subsequences" of the data, i.e. the mean of elements 1:10, 1:20, 1:30, ..., 1:100.
How can I do that using a loop that just saves the calculated means, instead of first creating the subsets and subsequently calculating their means?
CodePudding user response:
Create a splitting vector f using a cumsum trick, then use tapply or aggregate with the function mean.
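set.seed(2022)
x <- rnorm(100)
k <- 10L  # block size
# Mark the first element of each block of k values with a 1 ...
f <- c(1, rep(0, k - 1L))
f <- rep(f, length.out = length(x))
# ... so that the cumulative sum numbers the blocks 1, 2, ..., 10.
f <- cumsum(f)
tapply(x, f, mean)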
#>            1            2            3            4            5            6
#> -0.563706565  0.007282962  0.208630598  0.063939372 -0.360526835  0.622263561
#>            7            8            9           10
#> -0.096927090  0.753811231  0.462543860  0.290149022
aggregate(x ~ f, FUN = mean)
#>     f            x
#> 1   1 -0.563706565
#> 2   2  0.007282962
#> 3   3  0.208630598
#> 4   4  0.063939372
#> 5   5 -0.360526835
#> 6   6  0.622263561
#> 7   7 -0.096927090
#> 8   8  0.753811231
#> 9   9  0.462543860
#> 10 10  0.290149022
Created on 2022-04-08 by the reprex package (v2.0.1)
Edit
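I misread the problem, as the OP pointed out in a comment:
"Thanks! The problem is that this calculates means from 1:10, 11:20, 21:30 but I would actually need 1:10, 1:20, 1:30 until all 100 elements"
To get the means of the growing prefixes 1:10, 1:20, ..., 1:100, compute the mean of x[seq_len(k)] for each endpoint k: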
g <- seq(10L, length(x), by = 10L)
sapply(g, \(k) mean(x[seq_len(k)]))
#> [1] -0.563706565 -0.278211802 -0.115931002 -0.070963408 -0.128876094
#> [6] -0.003686151 -0.017006285 0.079345904 0.121923455 0.138746012
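Since the question asks specifically for a loop that only saves the means, here is a minimal sketch of one way to write it. The names prefix_means, running_sum and start are illustrative assumptions, not part of the original answer. It keeps a running total and adds only the newly included elements at each step, so each element of x is summed once.

```r
set.seed(2022)
x <- rnorm(100)

# Endpoints of the growing prefixes: 10, 20, ..., 100.
g <- seq(10L, length(x), by = 10L)

# Preallocate the result; only the means are stored.
prefix_means <- numeric(length(g))  # hypothetical name
running_sum <- 0                    # hypothetical name
start <- 1L                         # first index not yet added to the total

for (i in seq_along(g)) {
  # Add only the elements that are new in this prefix.
  running_sum <- running_sum + sum(x[start:g[i]])
  prefix_means[i] <- running_sum / g[i]
  start <- g[i] + 1L
}

prefix_means
```

With the same seed, the result should agree with the sapply output above up to floating-point rounding.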
CodePudding user response:
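set.seed(2022)
x <- rnorm(100)
# Each column of the 10-row matrix holds one block of 10 consecutive values.
mn <- colMeans(matrix(x, 10))
# All blocks have the same size, so the running mean of the block means is the prefix mean.
cumsum(mn) / seq_along(mn)
#> [1] -0.563706565 -0.278211802 -0.115931002 -0.070963408 -0.128876094
#> [6] -0.003686151 -0.017006285 0.079345904 0.121923455 0.138746012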
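The matrix approach relies on every block having the same size, and on length(x) being a multiple of the block size. As a hedged alternative that drops both requirements, the sketch below takes the cumulative sum of x itself and reads it off at the prefix endpoints. The names g and prefix_means are illustrative assumptions.

```r
set.seed(2022)
x <- rnorm(100)

# Endpoints of the prefixes; these need not be equally spaced.
g <- seq(10L, length(x), by = 10L)

# cumsum(x)[k] is the sum of x[1:k], so dividing by k gives the prefix mean.
prefix_means <- cumsum(x)[g] / g  # hypothetical name
prefix_means
```

For this example the values should agree with the colMeans result above up to floating-point rounding. For unequal prefixes, replace g with any increasing vector of endpoints, e.g. c(5L, 20L, 100L).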