In R, say I have the following data frame:
frame object positive
1 6 0
2 6 1
3 6 1
4 6 1
5 6 1
6 6 0
7 6 0
8 6 1
9 6 1
10 6 1
1 7 1
2 7 1
3 7 1
4 7 1
5 7 1
6 7 0
7 7 1
8 7 0
9 7 1
10 7 1
I am trying to create a new table that counts the runs of consecutive 1s in the positive column for each object and reports the maximum and mean run length. The result would look like this:
object max mean
6 4 3.5
7 5 8/3
Thank you for your help!
CodePudding user response:
Here is a solution which uses data.table::rleid to find consecutive occurrences of 1s.
library("tidyverse")
df <- tibble::tribble(
~frame, ~object, ~positive,
1L, 6L, 0L,
2L, 6L, 1L,
3L, 6L, 1L,
4L, 6L, 1L,
5L, 6L, 1L,
6L, 6L, 0L,
7L, 6L, 0L,
8L, 6L, 1L,
9L, 6L, 1L,
10L, 6L, 1L,
1L, 7L, 1L,
2L, 7L, 1L,
3L, 7L, 1L,
4L, 7L, 1L,
5L, 7L, 1L,
6L, 7L, 0L,
7L, 7L, 1L,
8L, 7L, 0L,
9L, 7L, 1L,
10L, 7L, 1L
)
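df %>%
  group_by(object) %>%
  # label each run of identical values within an object
  mutate(
    sequence = data.table::rleid(positive == 1)
  ) %>%
  # keep only the rows that belong to runs of 1s
  filter(
    positive == 1
  ) %>%
  # one row per run, holding the run length
  group_by(
    object, sequence
  ) %>%
  summarise(
    length = n()
  ) %>%
  # summarise the run lengths per object
  summarise(
    max = max(length),
    mean = mean(length)
  )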
#> `summarise()` has grouped output by 'object'. You can override using the
#> `.groups` argument.
#> # A tibble: 2 × 3
#> object max mean
#> <int> <int> <dbl>
#> 1 6 4 3.5
#> 2 7 5 2.67
Created on 2022-07-26 by the reprex package (v2.0.1)
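To make the run-labelling step easier to follow, here is a small sketch of the same idea on object 6's positive column alone. It uses a base R cumsum trick as a stand-in for data.table::rleid; that substitution is my own assumption for illustration, not a claim about how rleid is implemented.

```r
# positive values for object 6, copied from the question
positive <- c(0L, 1L, 1L, 1L, 1L, 0L, 0L, 1L, 1L, 1L)

# Base R stand-in for a run id: the id goes up by one each time
# the value differs from the previous element
run_id <- cumsum(c(TRUE, positive[-1] != positive[-length(positive)]))

# Keep only the ids that belong to runs of 1s and count each run's length
run_lengths <- as.integer(table(run_id[positive == 1]))

max(run_lengths)
mean(run_lengths)
```

The two runs of 1s (frames 2 to 5 and frames 8 to 10) get their own ids, so counting rows per id gives the run lengths that the pipeline above then summarises.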
CodePudding user response:
I created my own data so the output won't be exactly what you showed. Nevertheless it should do the trick.
library(dplyr)
set.seed(111)
df <- data.frame(frame = c(1:10, 1:10),
                 object = rep(6:7, each = 10),
                 positive = sample(0:1, 20, replace = TRUE))
df
frame object positive
1 1 6 1
2 2 6 1
3 3 6 1
4 4 6 0
5 5 6 1
6 6 6 0
7 7 6 0
8 8 6 0
9 9 6 1
10 10 6 1
11 1 7 1
12 2 7 0
13 3 7 1
14 4 7 0
15 5 7 0
16 6 7 1
17 7 7 0
18 8 7 0
19 9 7 0
20 10 7 1
df %>%
  group_by(object) %>%
  summarise(mean = mean(rle(positive)$lengths[rle(positive)$values == 1]),
            max = max(rle(positive)$lengths[rle(positive)$values == 1]))
# A tibble: 2 × 3
object mean max
<int> <dbl> <int>
1 6 2 3
2 7 1 1
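A possible variation, sketched in base R only: wrap the rle() logic in a small helper so it is called once per object, and decide explicitly what to return when an object has no 1s at all (otherwise max() on an empty vector gives -Inf with a warning). The helper name run_stats and the 0 / NA fallback are my own assumptions, not part of the answer above. The data here is the one from the question.

```r
# Hypothetical helper: summarise the runs of 1s in a 0/1 vector
# with a single rle() call
run_stats <- function(x) {
  r <- rle(x)
  ones <- r$lengths[r$values == 1]
  # Assumption: report max = 0 and mean = NA when there are no 1s
  if (length(ones) == 0) {
    return(c(max = 0, mean = NA_real_))
  }
  c(max = max(ones), mean = mean(ones))
}

# Data from the question
df <- data.frame(
  frame = rep(1:10, times = 2),
  object = rep(6:7, each = 10),
  positive = c(0, 1, 1, 1, 1, 0, 0, 1, 1, 1,
               1, 1, 1, 1, 1, 0, 1, 0, 1, 1)
)

# One row of statistics per object (rows named by object)
result <- t(sapply(split(df$positive, df$object), run_stats))
result
```

On the question's data this should reproduce the requested max and mean per object, with the objects as row names of a matrix rather than a column of a tibble.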