I have some telemetry data recording acoustic tag detections by a receiver. The data frame includes the unique tag code and the time of detection. Time is in POSIX time (number of seconds since 00:00:00 UTC on 1 January 1970).
det <- data.frame(
code = c("34", "15", "34", "36", "15", "34", "36", "15", "34", "15", "15", "15", "34", "15", "34", "15", "34", "15", "34", "15", "34", "34", "15", "36"),
posix = c(1599655073, 1599655136, 1599655136, 1599655152, 1599655199, 1599655199, 1599655210, 1599655262, 1599655262, 1599655325, 1599655388, 1599655451, 1599655451, 1599655514, 1599655514, 1599655577, 1599655577, 1599655640, 1599655640, 1599655703, 1599655703, 1599655765, 1599655766, 1599655789)
)
I need to know how many times a tag was detected in each minute. The output should record the sequential minute from the beginning of the series, the tag code, and the number of occurrences in that minute. If a code did not appear during a minute it should get a zero.
The first six lines of the output should look like this:
  min code freq
1   1   34    1
2   1   15    0
3   1   36    0
4   2   34    1
5   2   15    1
6   2   36    1
Thanks in advance.
CodePudding user response:
This is a good task for tidyr::complete: first convert the dates using the lubridate package, then fill the gaps with tidyr::complete and create the freq variable.
library(tidyr)
library(dplyr)
library(lubridate)
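det %>%
  # minute() returns the clock minute of the hour, so bins follow clock minutes
  mutate(posix = minute(as.POSIXlt(posix, origin = "1970-01-01")),
         minute = posix - first(posix) + 1) %>%
  complete(minute, code) %>%
  # rows added by complete() have posix = NA, so they get freq = 0
  mutate(freq = as.integer(complete.cases(posix))) %>%
  select(-posix)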
# A tibble: 39 x 3
   minute code   freq
    <dbl> <chr> <int>
 1      1 15        0
 2      1 34        1
 3      1 36        0
 4      2 15        1
 5      2 34        1
 6      2 36        0
 7      3 15        1
 8      3 34        1
 9      3 36        1
10      4 15        0
# ... with 29 more rows
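A variant of the same complete() idea, offered only as a sketch: it assumes "sequential minute" means 60-second bins measured from the first detection rather than clock minutes, and it counts detections with dplyr::count() first so that a tag heard more than once in a bin would get a frequency above 1. The names `freq_tbl` and `minute` are chosen here for illustration.

```r
library(dplyr)
library(tidyr)

# Same data as in the question.
det <- data.frame(
  code = c("34", "15", "34", "36", "15", "34", "36", "15", "34", "15", "15", "15",
           "34", "15", "34", "15", "34", "15", "34", "15", "34", "34", "15", "36"),
  posix = c(1599655073, 1599655136, 1599655136, 1599655152, 1599655199, 1599655199,
            1599655210, 1599655262, 1599655262, 1599655325, 1599655388, 1599655451,
            1599655451, 1599655514, 1599655514, 1599655577, 1599655577, 1599655640,
            1599655640, 1599655703, 1599655703, 1599655765, 1599655766, 1599655789)
)

freq_tbl <- det %>%
  # Assumption: bin 1 starts at the first detection and each bin lasts 60 s.
  mutate(minute = (posix - min(posix)) %/% 60 + 1) %>%
  # Count detections per bin and tag.
  count(minute, code, name = "freq") %>%
  # Add every missing minute/code pair, including bins with no detections
  # at all, and give those rows a frequency of 0.
  complete(minute = full_seq(minute, 1), code, fill = list(freq = 0L))

head(freq_tbl)
```

Here full_seq() makes sure that a bin with no detections from any tag still appears in the result, which a plain complete(minute, code) would not guarantee.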
CodePudding user response:
Converting minute and code to factors and specifying group_by(..., .drop = FALSE) keeps every combination of minute and code in the result, even those that never occur in the data. These combinations get a freq of 0.
library(dplyr)
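det |>
  # 60-second bins counted from the first detection
  mutate(minute = 1 + ((posix - min(posix)) %/% 60)) |>
  # factor levels cover every minute, including minutes with no detections
  mutate(minute = factor(minute, levels = seq(1, max(minute)))) |>
  mutate(code = factor(code)) |>
  group_by(minute, code, .drop = FALSE) |>
  summarize(freq = n(), .groups = "drop")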
# A tibble: 36 × 3
   minute code   freq
   <fct>  <fct> <int>
 1 1      15        0
 2 1      34        1
 3 1      36        0
 4 2      15        1
 5 2      34        1
 6 2      36        1
 7 3      15        1
 8 3      34        1
 9 3      36        1
10 4      15        1
# … with 26 more rows
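The same factor-level trick also works without dplyr, since table() keeps a cell for every combination of factor levels. The following is a minimal base R sketch, not part of the answer above; the names `bin`, `tab` and `out` are illustrative, and the code levels are taken in order of first appearance so the rows come out in the order the question asks for.

```r
# Same data as in the question.
det <- data.frame(
  code = c("34", "15", "34", "36", "15", "34", "36", "15", "34", "15", "15", "15",
           "34", "15", "34", "15", "34", "15", "34", "15", "34", "34", "15", "36"),
  posix = c(1599655073, 1599655136, 1599655136, 1599655152, 1599655199, 1599655199,
            1599655210, 1599655262, 1599655262, 1599655325, 1599655388, 1599655451,
            1599655451, 1599655514, 1599655514, 1599655577, 1599655577, 1599655640,
            1599655640, 1599655703, 1599655703, 1599655765, 1599655766, 1599655789)
)

# 60-second bins counted from the first detection.
bin <- (det$posix - min(det$posix)) %/% 60 + 1

# Cross-tabulate; unused factor levels are kept as zero counts.
tab <- table(
  min  = factor(bin, levels = seq_len(max(bin))),
  code = factor(det$code, levels = unique(det$code))
)

# Long format, one row per minute/code pair, sorted by minute.
out <- as.data.frame(tab, responseName = "freq", stringsAsFactors = FALSE)
out$min <- as.integer(out$min)
out <- out[order(out$min), ]
rownames(out) <- NULL

head(out)
```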