I am trying to find peaks of exposure in a dataset. I can already find them for one person (variable `name`), but now I would like to find them for every person in the dataset using a group by (or a related function).
The dataset looks like this:
Name | Time | Exposure |
---|---|---|
1 | 20:30:01 | 10 |
1 | 20:30:02 | 0 |
1 | 20:30:03 | 13 |
2 | 20:30:01 | 2 |
2 | 20:30:02 | 5 |
2 | 20:30:03 | 1 |
3 | 20:30:01 | 10 |
3 | 20:30:02 | 11 |
3 | 20:30:03 | 12 |
... | ... | ... |
The code to create an example of the dataset that I have:
#### Create dataset ####
time_initial <- seq(from = as.POSIXct("08:19:00", "%H:%M:%S", tz="UTC"), to = as.POSIXct("08:19:19", "%H:%M:%S", tz="UTC"), by = "1 sec")
time_min <- format(as.POSIXct(time_initial), format = '%H:%M:%S')
exposure_a <- c(0,0,2,3,5,0,0,4,5,6,0.5,0.25,0,0,0,0,4,5,0,0)
exposure_b <- c(1,2,3,6,0.5,0,0,0,0,0,0.5,0.25,0,0,0,0,0,0,0,0)
exposure_c <- c(0,0,0,0,0,0,9,4,0,0,0,0.25,0.75,0,0,0,4,7,8,0)
name_a <- rep("a", times = 20)
name_b <- rep("b", times = 20)
name_c <- rep("c", times = 20)
# Build the data frames directly: cbind() would coerce exposure to character
data_a <- data.frame(name = name_a, exposure = exposure_a)
data_b <- data.frame(name = name_b, exposure = exposure_b)
data_c <- data.frame(name = name_c, exposure = exposure_c)
data_abc <- rbind(data_a, data_b, data_c)
data_all <- data.frame(time_min = strftime(time_initial, format = "%H:%M:%S"),
                       data_abc)
The code to find peaks for the name "a":
#### Find peaks for name a ####
library(dplyr)       # %>% and filter()
library(data.table)  # setDT() and rleid()
data <- data_all %>%
  filter(name == "a")
#### Find peaks ####
# convert to data.table
setDT(data)
# identify groups
data[, group := rleid(exposure >= 1)]
# get height and start/end time of each run with exposure >= 1
peak_info <- data[exposure >= 1,
                  .(peak_height = max(exposure),
                    peak_start = min(time_min),
                    peak_end = max(time_min)),
                  by = group]
To obtain the same output (`peak_info`) for all data without having to filter for each person, I have tried adding the variable `name` to the `rleid` call, but it does not work.
If anyone knows how to do it, I would appreciate it a lot.
Thanks,
Miquel
CodePudding user response:
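To expand on my comment: compute the `rleid` groups separately for each person with `by = name`, then summarise by both `name` and `group`: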
setDT(data_all)
data_all[, group := rleid(exposure >= 1), by = name]
data_all[exposure >= 1,
         .(peak_height = max(exposure),
           peak_start = min(time_min),
           peak_end = max(time_min)),
         by = list(name, group)]
#    name group peak_height peak_start peak_end
# 1:    a     2           5   09:19:02 09:19:04
# 2:    a     4           6   09:19:07 09:19:09
# 3:    a     6           5   09:19:16 09:19:17
# 4:    b     1           6   09:19:00 09:19:03
# 5:    c     2           9   09:19:06 09:19:07
# 6:    c     4           8   09:19:16 09:19:18
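Since the question asks about a group by, the same idea can probably be written as a dplyr pipeline as well. The following is only a sketch under some assumptions: `toy` and its values are made up for illustration (they are not the asker's data), and `rleid()` is borrowed from data.table inside `mutate()`.

```r
library(dplyr)

# Hypothetical toy data: two people, six one-second readings each
toy <- data.frame(
  name     = rep(c("a", "b"), each = 6),
  time_min = rep(sprintf("08:19:%02d", 0:5), times = 2),
  exposure = c(0, 2, 3, 0, 4, 0,
               1, 0, 0, 5, 6, 0)
)

peak_info <- toy %>%
  group_by(name) %>%
  # run ids restart at 1 for each person because the data are grouped by name
  mutate(group = data.table::rleid(exposure >= 1)) %>%
  # keep only the rows that belong to a peak
  filter(exposure >= 1) %>%
  group_by(name, group) %>%
  summarise(peak_height = max(exposure),
            peak_start  = min(time_min),
            peak_end    = max(time_min),
            .groups = "drop")

print(peak_info)
```

Grouping by `name` before computing the run ids plays the same role as `by = name` in the data.table version: a run of high exposure at the end of one person's rows cannot merge with a run at the start of the next person's rows.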