R CUMSUM When Value is Met

Time:05-17

DATA = data.frame(STUDENT=c(1,1,1,1,2,2,2,3,3,3,3,3),
                  T = c(1,2,3,4,1,2,3,1,2,3,4,5),
                  SCORE=c(NA,1,5,2,3,4,4,1,4,5,2,2),
                  WANT=c('N','N','P','P','N','N','N','N','N','P','P','P'))

I have 'DATA' and wish to create the 'WANT' variable. It is 'N' by default, but within each 'STUDENT', once there is a 'SCORE' of 5 or higher, 'WANT' becomes 'P' and stays that way for the rest of that student's rows. I am looking for a dplyr solution.

CodePudding user response:

You can use cumany:

library(dplyr)
DATA %>% 
  group_by(STUDENT) %>% 
  # Treat NA as 0, then flag every row from the first SCORE >= 5 onward
  mutate(WANT2 = ifelse(cumany(ifelse(is.na(SCORE), 0, SCORE) >= 5),
                        "P", "N"))

# A tibble: 12 × 5
# Groups:   STUDENT [3]
   STUDENT     T SCORE WANT  WANT2
     <dbl> <dbl> <dbl> <chr> <chr>
 1       1     1    NA N     N    
 2       1     2     1 N     N    
 3       1     3     5 P     P    
 4       1     4     2 P     P    
 5       2     1     3 N     N    
 6       2     2     4 N     N    
 7       2     3     4 N     N    
 8       3     1     1 N     N    
 9       3     2     4 N     N    
10       3     3     5 P     P    
11       3     4     2 P     P    
12       3     5     2 P     P    
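
To see why this works, it may help to look at a single student's scores outside the pipeline. The sketch below is only an illustration: the names `score`, `hit`, `flag` and `want` are made up for this example, and the NA handling uses `!is.na()` rather than the inner `ifelse()`, which should be equivalent for this data.

```r
library(dplyr)

# Scores for STUDENT 1 from the question, including the leading NA.
score <- c(NA, 1, 5, 2)

# A missing score counts as "threshold not met", so cumany() never sees an NA.
hit <- !is.na(score) & score >= 5

# cumany() becomes TRUE at the first TRUE and stays TRUE for the rest of the vector.
flag <- cumany(hit)

# Map the logical flag to the labels used in the question.
want <- ifelse(flag, "P", "N")
print(want)
```

Inside `group_by(STUDENT)`, the same logic is applied to each student separately, so a 5 for one student does not carry over to the next.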

CodePudding user response:

You can use cummax():

library(dplyr)

DATA %>%
  group_by(STUDENT) %>%
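  # cummax() of the logical is 0 before the first SCORE >= 5 and 1 after; + 1 turns that into index 1 ("N") or 2 ("P")
  mutate(WANT = c("N", "P")[cummax(SCORE >= 5 & !is.na(SCORE)) + 1])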

# A tibble: 12 × 4
# Groups:   STUDENT [3]
   STUDENT     T SCORE WANT 
     <dbl> <dbl> <dbl> <chr>
 1       1     1    NA N    
 2       1     2     1 N    
 3       1     3     5 P    
 4       1     4     2 P    
 5       2     1     3 N    
 6       2     2     4 N    
 7       2     3     4 N    
 8       3     1     1 N    
 9       3     2     4 N    
10       3     3     5 P    
11       3     4     2 P    
12       3     5     2 P
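
Since `cummax()` is a base R function, the same idea can also be sketched without dplyr. This is not part of the answer above, only a rough base R equivalent: it assumes `ave()` for the per-student grouping, and the names `hit` and `reached` are made up for this example.

```r
# Same STUDENT and SCORE columns as in the question.
DATA <- data.frame(
  STUDENT = c(1, 1, 1, 1, 2, 2, 2, 3, 3, 3, 3, 3),
  SCORE   = c(NA, 1, 5, 2, 3, 4, 4, 1, 4, 5, 2, 2)
)

# TRUE where the score is present and at least 5; NA scores count as FALSE.
hit <- !is.na(DATA$SCORE) & DATA$SCORE >= 5

# ave() applies cummax() within each STUDENT:
# 0 before the first hit, 1 from the first hit onward.
reached <- ave(as.integer(hit), DATA$STUDENT, FUN = cummax)

# Adding 1 turns 0/1 into positions 1/2 of c("N", "P").
DATA$WANT <- c("N", "P")[reached + 1]
print(DATA)
```

The `+ 1` is needed because R vectors are indexed from 1, so a flag of 0 has to select the first label and a flag of 1 the second.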