How to find peaks grouping by name

Time:06-28

I am trying to find peaks of exposure in a dataset. I have been able to find them for one person (variable name), but now I would like to find them for every person in the dataset by applying a group by (or a related function).

The dataset looks like this:

Name Time Exposure
1 20:30:01 10
1 20:30:02 0
1 20:30:03 13
2 20:30:01 2
2 20:30:02 5
2 20:30:03 1
3 20:30:01 10
3 20:30:02 11
3 20:30:03 12
... ... ...

The code to create an example of the dataset that I have:

#### Create dataset ####

time_initial <- seq(from = as.POSIXct("08:19:00", format = "%H:%M:%S", tz = "UTC"),
                    to = as.POSIXct("08:19:19", format = "%H:%M:%S", tz = "UTC"),
                    by = "1 sec")
time_min <- format(as.POSIXct(time_initial), format = '%H:%M:%S')
exposure_a <- c(0,0,2,3,5,0,0,4,5,6,0.5,0.25,0,0,0,0,4,5,0,0)
exposure_b <- c(1,2,3,6,0.5,0,0,0,0,0,0.5,0.25,0,0,0,0,0,0,0,0)
exposure_c <- c(0,0,0,0,0,0,9,4,0,0,0,0.25,0.75,0,0,0,4,7,8,0)
name_a <- rep("a", times = 20)
name_b <- rep("b", times = 20)
name_c <- rep("c", times = 20)
# build the data frames directly so that exposure stays numeric
# (cbind() would coerce both columns to character)
data_a <- data.frame(name = name_a, exposure = exposure_a)
data_b <- data.frame(name = name_b, exposure = exposure_b)
data_c <- data.frame(name = name_c, exposure = exposure_c)
data_abc <- rbind(data_a, data_b, data_c)
data_all <- data.frame(time_min = strftime(time_initial, format = "%H:%M:%S"),
                       data_abc)
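
One thing to watch when building a dataset like this: cbind() on a character vector and a numeric vector returns a character matrix, so a data frame made from it no longer holds exposure as a number. Comparisons such as exposure >= 1 and max(exposure) then work on strings, which can give wrong answers once values like 10 appear. The small sketch below illustrates this with made-up values; the names m, df_chr and df_num are only for illustration.

```r
# cbind() on mixed vectors yields a character matrix
m <- cbind(name = rep("a", 3), exposure = c(2, 10, 0.5))
df_chr <- as.data.frame(m)
class(df_chr$exposure)   # not numeric (character, or factor in R versions before 4.0)

# building the data frame directly preserves the numeric type
df_num <- data.frame(name = rep("a", 3), exposure = c(2, 10, 0.5))
class(df_num$exposure)   # numeric
max(df_num$exposure)     # 10

# an existing non-numeric column can be converted back like this
df_chr$exposure <- as.numeric(as.character(df_chr$exposure))
```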

The code to find peaks for the name "a":

#### Find peaks for name a ####

library(dplyr)       # %>% and filter()
library(data.table)  # setDT() and rleid()

data <- data_all %>% 
  filter(name == "a")

#### Find peaks ####

# convert to data.table

setDT(data)

# identify groups 

data[, group := rleid(exposure >= 1)]

# get the height and the start/end time of each run with exposure >= 1

peak_info <- data[exposure >= 1,
                  .(peak_height = max(exposure),
                    peak_start = min(time_min),
                    peak_end = max(time_min)), 
                  by = group]

To obtain the same output (peak_info) for all data without having to filter for each person, I have tried to add the variable name into the rleid, but it does not work.

If anyone knows how to do it, I would appreciate it a lot.

Thanks,

Miquel

Answer:

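Compute the run ids within each name by adding by = name to the rleid() step, then group the summary by both name and group: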

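library(data.table)

setDT(data_all)

# number the runs of exposure >= 1 separately within each name
data_all[, group := rleid(exposure >= 1), by = name]

# summarise each peak per name and run
data_all[exposure >= 1,
         .(peak_height = max(exposure),
           peak_start = min(time_min),
           peak_end = max(time_min)),
         by = list(name, group)]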

#    name group peak_height peak_start peak_end
# 1:    a     2           5   09:19:02 09:19:04
# 2:    a     4           6   09:19:07 09:19:09
# 3:    a     6           5   09:19:16 09:19:17
# 4:    b     1           6   09:19:00 09:19:03
# 5:    c     2           9   09:19:06 09:19:07
# 6:    c     4           8   09:19:16 09:19:18
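
To make the effect of by = name easier to see, here is a minimal self-contained sketch on a small made-up table (toy, with hypothetical values, not the dataset from the question). It assumes exposure is numeric and that the rows of each name are contiguous and ordered by time. The last step also shows that rleid() accepts several vectors, which gives run ids that are unique across the whole table instead of restarting for each name.

```r
library(data.table)

# Toy data (hypothetical values): two names, six seconds each
toy <- data.table(
  name     = rep(c("a", "b"), each = 6),
  time_min = rep(sprintf("08:19:%02d", 0:5), times = 2),
  exposure = c(0, 2, 5, 0, 3, 0,   # a: runs above threshold at 01-02 and at 04
               4, 6, 0, 0, 0, 1)   # b: runs above threshold at 00-01 and at 05
)

# Run ids restart at 1 for every name because of by = name
toy[, group := rleid(exposure >= 1), by = name]

# One row per name and run with exposure >= 1
peaks <- toy[exposure >= 1,
             .(peak_height = max(exposure),
               peak_start  = min(time_min),
               peak_end    = max(time_min)),
             by = .(name, group)]
print(peaks)

# Alternative: a single rleid() call over both vectors,
# so a new run starts whenever name or the threshold flag changes
toy[, group_global := rleid(name, exposure >= 1)]
```

With the per-name version the pair (name, group) identifies a peak, while group_global alone is enough in the alternative. Which one is more convenient depends on how the result is used afterwards.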