Max between the adjoining and the value just above in R [Similar to RETAIN in SAS]

Time:11-13

Sample data:

df <- data.frame(id = c('a','a','a','a','a','a','b','b','b','b'),
                 score = c(9,5,1,8,4,2,1,9,8,3))

df

I am trying to create a max_score column that, within each id, holds the maximum of the previous row's max_score and the current row's score.

In SAS this is possible with the RETAIN statement. I am looking for an equivalent in R.


I attempted the following approach:

df %>% group_by(id) %>% mutate(first = row_number()) %>% ungroup() %>% 
        mutate(max_score = ifelse(first=1,score,max(score,lag(max_score))))

CodePudding user response:

I'm not sure I fully understand your explanation of the logic behind max_score, but the following reproduces your expected output:

library(dplyr)
df %>%
    group_by(id) %>% 
    mutate(max_score = cummax(score)) %>%
    ungroup()
# A tibble: 10 × 3
#   id    score max_score
#   <chr> <dbl>     <dbl>
# 1 a         9         9
# 2 a         5         9
# 3 a         1         9
# 4 a         8         9
# 5 a         4         9
# 6 a         2         9
# 7 b         1         1
# 8 b         9         9
# 9 b         8         9
#10 b         3         9
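
To see why cummax() matches the RETAIN-style logic, the recurrence can be written out explicitly. The following is a minimal sketch in base R, assuming the df from the question; retain_max is a hypothetical helper name used only for illustration, not a function from any package.

```r
df <- data.frame(id = c('a','a','a','a','a','a','b','b','b','b'),
                 score = c(9,5,1,8,4,2,1,9,8,3))

# Hypothetical helper: each step takes max(previous result, current value),
# which is the "carry the last value forward" pattern RETAIN is used for
retain_max <- function(x) Reduce(max, x, accumulate = TRUE)

# ave() applies the function within each id and keeps the original row order
df$max_score <- ave(df$score, df$id, FUN = retain_max)

# Compare against the built-in cumulative maximum, also applied per id
same_as_cummax <- identical(df$max_score, ave(df$score, df$id, FUN = cummax))
same_as_cummax
```

The explicit version is mainly useful for checking the logic; cummax() does the same job in one vectorized call.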

I suggest thoroughly testing this on more use cases. If you run into an issue, please edit your main post to include additional sample data and the matching expected output.
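
One case that may be worth checking is missing values: cummax() propagates an NA to every later element, whereas SAS's MAX function ignores missing values. The sketch below assumes NAs should be skipped, which the question does not state; cummax_skip_na is a hypothetical helper name.

```r
# Hypothetical helper: running maximum that skips NA instead of propagating it.
# Assumes the data contain no genuine -Inf values.
cummax_skip_na <- function(x) {
  out <- cummax(ifelse(is.na(x), -Inf, x))
  out[is.infinite(out) & out < 0] <- NA  # positions before any non-NA value stay NA
  out
}

x <- c(3, NA, 5, 2)
cummax(x)          # NA from the second element onward
cummax_skip_na(x)  # running maximum with the NA skipped
```

Inside the grouped pipeline this could replace cummax(score) in the mutate() call, if that behaviour is what the data need.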
