I have the following dataframe:
ID year level
1 2000 NA
1 2001 3
1 2002 3
1 2003 2
1 2004 1
2 2000 1
2 2001 3
2 2002 3
2 2003 3
2 2004 3
I want to update each value in "level" column by ID based on the previous one if the previous one is smaller.
the dataframe should look like this
ID year level
1 2000 NA
1 2001 3
1 2002 3
1 2003 2
1 2004 1
2 2000 1
2 2001 1
2 2002 1
2 2003 1
2 2004 1
I tried using shift from data table but it only changes one cell. I got this result
ID year level
1 2000 NA
1 2001 3
1 2002 3
1 2003 2
1 2004 1
2 2000 1
2 2001 1
2 2002 3
2 2003 3
2 2004 3
CodePudding user response:
library(data.table)
df <- data.frame(
ID = c(1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L),
year = c(2000L,2001L,2002L,2003L,2004L,
2000L,2001L,2002L,2003L,2004L),
level = c(NA, 3L, 3L, 2L, 1L, 1L, 1L, 1L, 1L, 1L)
)
setDT(df)[!is.na(level), level := cummin(level), by = ID][]
#> ID year level
#> 1: 1 2000 NA
#> 2: 1 2001 3
#> 3: 1 2002 3
#> 4: 1 2003 2
#> 5: 1 2004 1
#> 6: 2 2000 1
#> 7: 2 2001 1
#> 8: 2 2002 1
#> 9: 2 2003 1
#> 10: 2 2004 1
Created on 2022-07-06 by the reprex package (v2.0.1)
CodePudding user response:
A tidyverse
solution with accumulate()
from purrr
:
library(tidyverse)
df %>%
group_by(ID) %>%
mutate(level2 = accumulate(level, min, na.rm = TRUE)) %>%
ungroup()
# # A tibble: 10 × 4
# ID year level level2
# <int> <int> <int> <int>
# 1 1 2000 NA NA
# 2 1 2001 3 3
# 3 1 2002 3 3
# 4 1 2003 2 2
# 5 1 2004 1 1
# 6 2 2000 1 1
# 7 2 2001 3 1
# 8 2 2002 3 1
# 9 2 2003 3 1
# 10 2 2004 3 1
CodePudding user response:
You can use
df$level <- Reduce(function(...) min(..., na.rm = T), df$level, accumulate = T)
to obtain your desired result:
ID year level
1 2000 NA
1 2001 3
1 2002 3
1 2003 2
1 2004 1
2 2000 1
2 2001 1
2 2002 1
2 2003 1
2 2004 1