Sample data:
df <- data.frame(id = c('a','a','a','a','a','a','b','b','b','b'),
score = c(9,5,1,8,4,2,1,9,8,3))
df
I am trying to create a max_score column that, for each id, holds the larger of the previous max_score value (within the same id) and the score value in the current row, i.e. a running maximum per id.
In SAS this is possible with the RETAIN statement. I am looking for an equivalent in R.
I attempted the following approach:
df %>% group_by(id) %>% mutate(first = row_number()) %>% ungroup() %>%
mutate(max_score = ifelse(first=1,score,max(score,lag(max_score))))
CodePudding user response:
I'm not sure I fully understand your explanation of the logic behind max_score, but the following reproduces your expected output:
library(dplyr)
df %>%
group_by(id) %>%
mutate(max_score = cummax(score)) %>%
ungroup()
# A tibble: 10 × 3
#    id    score max_score
#    <chr> <dbl>     <dbl>
#  1 a         9         9
#  2 a         5         9
#  3 a         1         9
#  4 a         8         9
#  5 a         4         9
#  6 a         2         9
#  7 b         1         1
#  8 b         9         9
#  9 b         8         9
# 10 b         3         9
I suggest thoroughly testing this on more use cases. If you run into an issue, please edit your main post to include additional sample data & matching expected output.
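For comparison with SAS's RETAIN, here is a minimal base-R sketch that spells out the "carry the previous max_score forward" logic explicitly and checks it against cummax(). The helper name retain_max is hypothetical (it is not part of any package), and the sketch assumes the rows are already in the desired order within each id.

```r
# Sample data from the question
df <- data.frame(id = c('a','a','a','a','a','a','b','b','b','b'),
                 score = c(9,5,1,8,4,2,1,9,8,3))

# Hypothetical helper: Reduce() with accumulate = TRUE keeps every
# intermediate result, so element i is max(previous result, x[i]),
# which mirrors a retained variable in SAS
retain_max <- function(x) Reduce(max, x, accumulate = TRUE)

# ave() applies the function within each id and returns the values
# in the original row order
df$max_score <- ave(df$score, df$id, FUN = retain_max)

# Same result via the built-in cumulative maximum
df$max_score_check <- ave(df$score, df$id, FUN = cummax)

df
```

For a plain running maximum, cummax() is the simpler choice. The Reduce() form may be useful when the update rule is something other than a built-in cumulative function, since max can be swapped for any two-argument function of the carried value and the current value.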