R: Compute minimum value of column with last time frame

Time:07-14

I have this data frame:

df <- tibble(time = c(as.POSIXct("2012-01-01 00:00:00"), as.POSIXct("2012-01-02 00:00:00"), as.POSIXct("2012-01-04 00:00:00")),
             value = c(0, 0.1, 0.2))
  time                value
  <dttm>              <dbl>
1 2012-01-01 00:00:00   0  
2 2012-01-02 00:00:00   0.1
3 2012-01-04 00:00:00   0.2

What is a nice way to compute, for each row, the minimum value within the preceding 48 hours? My approach works, but I'm sure there is a cleverer one:

library(dplyr)

df <- tibble(time = c(as.POSIXct("2012-01-01 00:00:00"), as.POSIXct("2012-01-02 00:00:00"), as.POSIXct("2012-01-04 00:00:00")),
             value = c(0, 0.1, 0.2)) %>%
  mutate(min_v_48h = purrr::pmap_dbl(
    .l = list(time),
    .f = function(upper_bound) {
      # Window is [upper_bound - 48 h, upper_bound), i.e. the current row is excluded
      lower_bound <- upper_bound - (48 * 60 * 60)
      valid <- .data[["value"]][.data[["time"]] >= lower_bound & .data[["time"]] < upper_bound]
      # Return Inf for an empty window instead of calling min() on nothing
      if (length(valid) > 0) min(valid) else Inf
    }))
  time                value min_v_48h
  <dttm>              <dbl>     <dbl>
1 2012-01-01 00:00:00   0       Inf  
2 2012-01-02 00:00:00   0.1       0  
3 2012-01-04 00:00:00   0.2       0.1

CodePudding user response:

The runner package is excellent for this, particularly with times.

library(dplyr)

df <- tibble(time = c(as.POSIXct("2012-01-01 00:00:00"), as.POSIXct("2012-01-02 00:00:00"), as.POSIXct("2012-01-04 00:00:00")),
             value = c(0, 0.1, 0.2))

library(runner)

df %>%
  mutate(
    min_v_48h = runner(
      x = value,
      idx = time,
      k = "2 days",
      f = min,
      lag = 1
    )
  )
#> Warning in f(.this_window, ...): no non-missing arguments to min; returning Inf
#> # A tibble: 3 × 3
#>   time                value min_v_48h
#>   <dttm>              <dbl>     <dbl>
#> 1 2012-01-01 00:00:00   0       Inf  
#> 2 2012-01-02 00:00:00   0.1       0  
#> 3 2012-01-04 00:00:00   0.2       0.1

Created on 2022-07-13 by the reprex package (v2.0.1)

CodePudding user response:

What about using map2_dbl (also from purrr) and lubridate instead?

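library(tidyverse)
library(lubridate)  # for days(); not attached by older tidyverse versions

# The code and output here use a column named timex, so rename time first
df |>
  rename(timex = time) |>
  mutate(min_v_48h = map2_dbl(timex, timex - days(2), ~ min(value[timex >= .y & timex < .x])))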

Use timex <= .x instead of timex < .x if you want to include the current time in the interval.

Output:

# A tibble: 3 × 3
  timex               value min_v_48h
  <dttm>              <dbl>     <dbl>
1 2012-01-01 00:00:00   0       Inf  
2 2012-01-02 00:00:00   0.1       0  
3 2012-01-04 00:00:00   0.2       0.1
Warning message:
Problem while computing `min_v_48h = map2_dbl(...)`.
ℹ no non-missing arguments to min; returning Inf 

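That is, min() is called on an empty window for the first row, so you might want to add your validation step (the length check from the question) to avoid the warning.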
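
One way such a validation step could look, sketched in base R only so that it runs without purrr, lubridate or runner. The helper name min_in_window and its hours and empty parameters are made up for this illustration and are not part of any package; the window is [t - 48 h, t), as in the question.

```r
# Same three rows as in the question, as a plain data frame (base R only).
df <- data.frame(
  time  = as.POSIXct(c("2012-01-01 00:00:00", "2012-01-02 00:00:00",
                       "2012-01-04 00:00:00"), tz = "UTC"),
  value = c(0, 0.1, 0.2)
)

# Hypothetical helper: for each time t, the minimum of `value` over the
# window [t - hours, t), or `empty` when no row falls into that window.
min_in_window <- function(time, value, hours = 48, empty = Inf) {
  vapply(seq_along(time), function(i) {
    lower <- time[i] - hours * 3600          # POSIXct arithmetic is in seconds
    in_window <- time >= lower & time < time[i]
    # Validation step: only call min() when the window is non-empty
    if (any(in_window)) min(value[in_window]) else empty
  }, numeric(1))
}

df$min_v_48h <- min_in_window(df$time, df$value)
df$min_v_48h
```

Passing empty = NA_real_ instead of the default would mark rows without any earlier observation as missing rather than Inf, which may be easier to handle downstream.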
