dplyr find grouped minimum when column equals value

Time:02-18

I have a data frame that looks as follows:

GROUP    DATE    VALUE    TIMEFRAME
  A     1/1/20    1           0
  A     1/2/20    0           0
  A     1/3/20   .5           1
  A     1/4/20   .3           1
  A     1/5/20    1           0
  B     1/3/20   .1           1
  B     1/4/20   .2           1
  B     1/5/20    1           0

What I want to do is calculate the minimum of VALUE per GROUP, but only over the rows where TIMEFRAME == 1, and attach that minimum to every row of the group. The output would be as follows:

GROUP    DATE    VALUE    TIMEFRAME   MIN
  A     1/1/20    1           0        .3
  A     1/2/20    0           0        .3
  A     1/3/20   .5           1        .3
  A     1/4/20   .3           1        .3
  A     1/5/20    1           0        .3
  B     1/3/20   .1           1        .1
  B     1/4/20   .2           1        .1
  B     1/5/20    1           0        .1

I have tried using filter() here, but it drops the rows where TIMEFRAME == 0, and I need to keep all of the rows.

CodePudding user response:

After grouping by 'GROUP', subset 'VALUE' inside mutate() with the logical expression TIMEFRAME == 1 and take the min of that subset. The result is a single value per group, so mutate() recycles it to every row of the group and no rows are dropped.

library(dplyr)
df1 %>%
  group_by(GROUP) %>% 
  # min over the rows of each group where TIMEFRAME == 1, recycled to all rows
  mutate(MIN = min(VALUE[TIMEFRAME == 1])) %>%
  ungroup()

-output

# A tibble: 8 × 5
  GROUP DATE   VALUE TIMEFRAME   MIN
  <chr> <chr>  <dbl>     <int> <dbl>
1 A     1/1/20   1           0   0.3
2 A     1/2/20   0           0   0.3
3 A     1/3/20   0.5         1   0.3
4 A     1/4/20   0.3         1   0.3
5 A     1/5/20   1           0   0.3
6 B     1/3/20   0.1         1   0.1
7 B     1/4/20   0.2         1   0.1
8 B     1/5/20   1           0   0.1
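
One edge case the example data does not cover: if a group has no rows with TIMEFRAME == 1, min() receives an empty vector and returns Inf with a warning. A minimal sketch of one way to guard against that, assuming NA is the preferred result for such a group and that TIMEFRAME has no missing values (the data frame `df2` and its group "C" are made up for illustration):

```r
library(dplyr)

# Hypothetical data: group "C" has no rows with TIMEFRAME == 1
df2 <- data.frame(
  GROUP     = c("A", "A", "C", "C"),
  VALUE     = c(0.5, 0.3, 1, 2),
  TIMEFRAME = c(1L, 1L, 0L, 0L)
)

res <- df2 %>%
  group_by(GROUP) %>%
  # Take the min only when the group has at least one qualifying row;
  # otherwise return a typed NA so MIN stays a double column
  mutate(MIN = if (any(TIMEFRAME == 1)) min(VALUE[TIMEFRAME == 1]) else NA_real_) %>%
  ungroup()

print(res)
```

Here the plain if/else works because the condition is evaluated once per group and yields a single TRUE or FALSE.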

data

df1 <- structure(list(GROUP = c("A", "A", "A", "A", "A", "B", "B", "B"
), DATE = c("1/1/20", "1/2/20", "1/3/20", "1/4/20", "1/5/20", 
"1/3/20", "1/4/20", "1/5/20"), VALUE = c(1, 0, 0.5, 0.3, 1, 0.1, 
0.2, 1), TIMEFRAME = c(0L, 0L, 1L, 1L, 0L, 1L, 1L, 0L)), 
class = "data.frame", row.names = c(NA, 
-8L))
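
Since the question mentions trying filter(), here is a sketch of an alternative that still uses it: compute the per-group minimum on the filtered rows, then join it back onto the full data so no rows are lost. The data frame `toy` and the intermediate `mins` are illustrative names and a shortened version of the data, not part of the original answer.

```r
library(dplyr)

# Shortened, hypothetical version of the question's data
toy <- data.frame(
  GROUP     = c("A", "A", "A", "B", "B"),
  VALUE     = c(1, 0.5, 0.3, 0.1, 1),
  TIMEFRAME = c(0L, 1L, 1L, 1L, 0L)
)

# Step 1: one row per group, holding the min over TIMEFRAME == 1 rows only
mins <- toy %>%
  filter(TIMEFRAME == 1) %>%
  group_by(GROUP) %>%
  summarise(MIN = min(VALUE), .groups = "drop")

# Step 2: attach that min to every original row of the matching group
res <- toy %>%
  left_join(mins, by = "GROUP")

print(res)
```

With this approach, a group that has no TIMEFRAME == 1 rows is absent from `mins`, so left_join() leaves its MIN as NA. The single mutate() call in the answer is shorter when that case cannot occur.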