Summarize number of occurrences per minute in R

Time:02-25

I have some telemetry data recording acoustic tag detections by a receiver. The data frame includes the unique tag code and the time of each detection. Time is in POSIX time (number of seconds since 00:00:00 UTC on 1 January 1970).

det <- data.frame(
code = c("34", "15", "34", "36", "15", "34", "36", "15", "34", "15", "15", "15", "34", "15", "34", "15", "34", "15", "34", "15", "34", "34", "15", "36"),
posix = c(1599655073, 1599655136, 1599655136, 1599655152, 1599655199, 1599655199, 1599655210, 1599655262, 1599655262, 1599655325, 1599655388, 1599655451, 1599655451, 1599655514, 1599655514, 1599655577, 1599655577, 1599655640, 1599655640, 1599655703, 1599655703, 1599655765, 1599655766, 1599655789)
) 

I need to know how many times a tag was detected in each minute. The output should record the sequential minute from the beginning of the series, the tag code, and the number of occurrences in that minute. If a code did not appear during a minute it should get a zero.

The first six lines of the output should look like this:

 min code freq
1   1   34    1
2   1   15    0
3   1   36    0
4   2   34    1
5   2   15    1
6   2   36    1

Thanks in advance.

CodePudding user response:

This is a good task for tidyr::complete: first extract the minute of each detection using the lubridate package, then fill the gaps with tidyr::complete and create the freq variable.

library(tidyr)
library(dplyr)
library(lubridate)

det %>% 
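  # lubridate::minute() gives the clock minute (minute of the hour) of each detection
  mutate(posix = minute(as.POSIXlt(posix, origin = "1970-01-01")),
         # number the minutes from 1, starting at the first detection
         minute = posix - first(posix) + 1) %>% 
  complete(minute, code) %>%   # add the missing minute/code combinations (posix is NA there)
  mutate(freq = +complete.cases(posix)) %>%   # 1 if detected, 0 if the row was filled in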
  select(-posix)

# A tibble: 39 x 3
   minute code   freq
    <dbl> <chr> <int>
 1      1 15        0
 2      1 34        1
 3      1 36        0
 4      2 15        1
 5      2 34        1
 6      2 36        0
 7      3 15        1
 8      3 34        1
 9      3 36        1
10      4 15        0
# ... with 29 more rows
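
If the minutes should instead be 60-second windows counted from the first detection, as in the expected output in the question, a variant along the following lines may work. It is a sketch rather than a definitive implementation: it bins on elapsed seconds instead of the clock minute (lubridate::minute returns the minute of the hour), and it uses count() so that repeated detections of a tag within one minute are tallied rather than flagged. The name res is only illustrative.

```r
library(dplyr)
library(tidyr)

# Sample data from the question
det <- data.frame(
  code = c("34", "15", "34", "36", "15", "34", "36", "15", "34", "15", "15", "15",
           "34", "15", "34", "15", "34", "15", "34", "15", "34", "34", "15", "36"),
  posix = c(1599655073, 1599655136, 1599655136, 1599655152, 1599655199, 1599655199,
            1599655210, 1599655262, 1599655262, 1599655325, 1599655388, 1599655451,
            1599655451, 1599655514, 1599655514, 1599655577, 1599655577, 1599655640,
            1599655640, 1599655703, 1599655703, 1599655765, 1599655766, 1599655789)
)

res <- det %>%
  # elapsed 60-second bin, numbered from 1 at the first detection
  mutate(minute = (posix - min(posix)) %/% 60 + 1) %>%
  # number of detections per minute and tag
  count(minute, code, name = "freq") %>%
  # add every missing minute/code pair with a count of 0
  complete(minute = full_seq(minute, 1), code, fill = list(freq = 0L))

res
```

For the sample data this gives 36 rows (12 minutes × 3 codes), with tag 36 counted once in minute 2.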

CodePudding user response:

Converting minute and code to factors and specifying group_by(..., .drop = FALSE) makes it possible to get every combination of minute and code, even those that do not occur in the data. These combinations are assigned a freq of 0.

library(dplyr)

det |>
  mutate(minute = 1 + ((posix - min(posix)) %/% 60)) |>
  mutate(minute = factor(minute, levels = seq(1, max(minute)))) |>
  mutate(code = factor(code)) |>
  group_by(minute, code, .drop = FALSE) |>
  summarize(freq = n(), .groups = "drop")

# A tibble: 36 × 3
   minute code   freq
   <fct>  <fct> <int>
 1 1      15        0
 2 1      34        1
 3 1      36        0
 4 2      15        1
 5 2      34        1
 6 2      36        1
 7 3      15        1
 8 3      34        1
 9 3      36        1
10 4      15        1
# … with 26 more rows
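
The same factor idea can also be sketched in base R, assuming dplyr is not available: table() tabulates every combination of factor levels, so minute/code pairs with no detections come out as 0. The names minute and out are illustrative.

```r
# Sample data from the question
det <- data.frame(
  code = c("34", "15", "34", "36", "15", "34", "36", "15", "34", "15", "15", "15",
           "34", "15", "34", "15", "34", "15", "34", "15", "34", "34", "15", "36"),
  posix = c(1599655073, 1599655136, 1599655136, 1599655152, 1599655199, 1599655199,
            1599655210, 1599655262, 1599655262, 1599655325, 1599655388, 1599655451,
            1599655451, 1599655514, 1599655514, 1599655577, 1599655577, 1599655640,
            1599655640, 1599655703, 1599655703, 1599655765, 1599655766, 1599655789)
)

# Elapsed 60-second bin, numbered from 1 at the first detection
minute <- (det$posix - min(det$posix)) %/% 60 + 1
# Declare every minute as a level so that empty minutes are kept
minute <- factor(minute, levels = seq_len(max(minute)))

# Cross-tabulate code by minute and reshape to long format
out <- as.data.frame(table(code = det$code, minute = minute),
                     responseName = "freq", stringsAsFactors = FALSE)
out$minute <- as.integer(out$minute)
out <- out[, c("minute", "code", "freq")]

head(out)
```

Because code is the first dimension of the table, the rows come out ordered by minute and then by code.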

