I have the following data.frame
with time
column sorted in ascending order:
colA=c(1,2,5,6,7,10,13,16,19,20,25,40,43,44,50,51,52,53,68,69,77,79,81,82)
colB=rnorm(24)
df=data.frame(time=colA, x=colB)
How can I count and take the average of the consecutive time-steps observed in the time
column?
In detail, I need to group the rows in the time
column by consecutive observations, e.g. 1,2 and 5,6,7 and 19,20 and 43,44, etc... and then take the average of the length of each group.
CodePudding user response:
You can group clusters of consecutive observations like this:
df$group <- c(0, cumsum(diff(df$time) != 1)) 1
Which gives:
df
#> time x group
#> 1 1 0.7443742 1
#> 2 2 0.1289818 1
#> 3 5 1.4882743 2
#> 4 6 -0.6626820 2
#> 5 7 -1.1606550 2
#> 6 10 0.3587742 3
#> 7 13 -0.1948464 4
#> 8 16 -0.2952820 5
#> 9 19 0.4966404 6
#> 10 20 0.4849128 6
#> 11 25 0.0187845 7
#> 12 40 0.6347746 8
#> 13 43 0.7544441 9
#> 14 44 0.8335890 9
#> 15 50 0.9657613 10
#> 16 51 1.2938800 10
#> 17 52 -0.1365510 10
#> 18 53 -0.4401387 10
#> 19 68 -1.2272839 11
#> 20 69 -0.2376531 11
#> 21 77 -0.9268582 12
#> 22 79 0.4112354 13
#> 23 81 -0.1988646 14
#> 24 82 -0.5574496 14
You can get the length of these groups by doing:
rle(df$group)$lengths
#> [1] 2 3 1 1 1 2 1 1 2 4 2 1 1 2
And the average length of the consecutive groups is:
mean(rle(df$group)$lengths)
#> [1] 1.714286
And the average of x within each group using
tapply(df$x, df$group, mean)
#> 1 2 3 4 5 6 7
#> 0.4366780 -0.1116876 0.3587742 -0.1948464 -0.2952820 0.4907766 0.0187845
#> 8 9 10 11 12 13 14
#> 0.6347746 0.7940166 0.4207379 -0.7324685 -0.9268582 0.4112354 -0.3781571
CodePudding user response:
Another way, doing some kind of consecutive length encoding:
l = diff(c(0, which(diff(df$time) != 1), nrow(df)))
l
# [1] 2 3 1 1 1 2 1 1 2 4 2 1 1 2
mean(l)
#[1] 1.714286