Suppose I have a time series of case counts generated by a transmission model, with each model labelled separately:
trial.df <- data.frame(
  model = rep(1:3, each = 5),
  time = rep(1:5, 3),
  cases = rnorm(15, mean = 1, sd = 0.2)
)
In the real world, once you get to < 1 case, the disease should go extinct, but due to the way the model code works, you can still have cases > 1 even after they drop below 1. How can I force all values that come after a value < 1 to be zero?
I tried finding the first time point at which floor(cases) is zero for each model and then replacing all values from that time point onwards with zero, but it doesn't work with my current code:
trial.df %>%
group_by(model) %>%
filter(floor(cases) == 0) %>%
slice(1) %>%
ungroup() -> rows
for (i in nrow(rows)){
trial.df %>%
group_by(model) %>%
mutate(cases2= case_when(
model == rows$model[i] & time >= rows$time[i] ~ 0,
TRUE ~ cases
)) %>% print(n = Inf)
}
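For reference, the loop idea can be made to work. The attempt above has three problems: `for (i in nrow(rows))` visits only the single value `nrow(rows)` rather than every row, each iteration starts again from the untouched `trial.df`, and the result is printed but never stored. Below is a minimal sketch of one possible repair, assuming the same `trial.df` layout; the names `result` and `hit` are made up for illustration.

```r
library(dplyr)

set.seed(42)
trial.df <- data.frame(
  model = rep(1:3, each = 5),
  time = rep(1:5, 3),
  cases = rnorm(15, mean = 1, sd = 0.2)
)

# First time point per model at which floor(cases) is zero
rows <- trial.df %>%
  group_by(model) %>%
  filter(floor(cases) == 0) %>%
  slice(1) %>%
  ungroup()

# Start from the original values and keep the changes across iterations
result <- trial.df %>% mutate(cases2 = cases)
for (i in seq_len(nrow(rows))) {
  # Rows of the current model at or after its first sub-1 time point
  hit <- result$model == rows$model[i] & result$time >= rows$time[i]
  result$cases2[hit] <- 0
}
result
```

`seq_len(nrow(rows))` also behaves sensibly when `rows` is empty, in which case the loop body simply never runs.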
I also tried using replace() to do this, but this doesn't consider the newly replaced values, only those in the original data:
trial.df %>%
group_by(model) %>%
mutate(cases2 = replace(cases, floor(lag(cases)) == 0, 0))
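A small sketch on a made-up vector may help show why this falls short. `x` and `prev_below_one` are hypothetical names, not model output: the condition only looks one step back in the original values, so a single position gets replaced.

```r
# Hypothetical series: drops below 1 at position 2, then recovers
x <- c(1.2, 0.8, 1.1, 1.3)

# TRUE only where the *original* previous value lies in [0, 1);
# the first element is NA because it has no predecessor
prev_below_one <- floor(dplyr::lag(x)) == 0

# Only position 3 is replaced; position 2 keeps 0.8 and position 4 keeps 1.3
y <- replace(x, prev_below_one, 0)
y
```

Nothing carries the "already extinct" state forward, which is the gap a cumulative function fills.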
I get the feeling this should be relatively straightforward, but can't seem to get it to work. Would really appreciate the advice! Thanks
CodePudding user response:
You can use dplyr::cumany:
library(dplyr)
set.seed(42)
trial.df <- data.frame(
  model = rep(1:3, each = 5),
  time = rep(1:5, 3),
  cases = rnorm(15, mean = 1, sd = 0.2)
)
trial.df %>%
  group_by(model) %>%
  # cumany() is TRUE from the first row where cases < 1 onwards
  mutate(cases2 = if_else(cumany(cases < 1), 0, cases)) %>%
  ungroup()
# # A tibble: 15 x 4
#    model  time cases cases2
#    <int> <int> <dbl>  <dbl>
#  1     1     1 1.27    1.27
#  2     1     2 0.887   0
#  3     1     3 1.07    0
#  4     1     4 1.13    0
#  5     1     5 1.08    0
#  6     2     1 0.979   0
#  7     2     2 1.30    0
#  8     2     3 0.981   0
#  9     2     4 1.40    0
# 10     2     5 0.987   0
# 11     3     1 1.26    1.26
# 12     3     2 1.46    1.46
# 13     3     3 0.722   0
# 14     3     4 0.944   0
# 15     3     5 0.973   0
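To see what `cumany()` contributes, here is a minimal sketch on a single made-up vector (the names `x`, `below`, `seen` and `out` are only for illustration). Once the condition has been TRUE at any earlier position, the cumulative flag stays TRUE.

```r
library(dplyr)

# Hypothetical series that dips below 1 and then recovers
x <- c(2, 1, 0.5, 0, 1)

below <- x < 1          # FALSE FALSE  TRUE  TRUE FALSE
seen <- cumany(below)   # FALSE FALSE  TRUE  TRUE  TRUE

# Zero out everything from the first sub-1 value onwards
out <- if_else(seen, 0, x)
out                     # 2 1 0 0 0
```

Inside `group_by(model)`, the same logic restarts for each model, so one model going extinct does not affect the others.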
CodePudding user response:
Thanks for that response Henrik, it worked brilliantly! Don't know why I can't see your answer now, so sharing it so others can learn from it.
Using the cummin() function:
x <- c(2, 1, 0.5, 0, 1)
# x >= 1 is TRUE/FALSE; cummin() turns it into 1s until the first FALSE, then 0s
x * cummin(x >= 1)  # 2 1 0 0 0
trial.df %>%
group_by(model) %>%
mutate(cases2= cases*cummin(cases>=1)) %>% print(n = Inf)
I was sure there was a simple solution to it. I should really understand the use of the cum..() functions more! Thanks again!
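If dplyr is not available, the same cummin() idea can probably be written in base R with ave(), which applies a function within each group and returns the results in the original row order. This is a sketch under the assumption that the data look like the example above, not a tested replacement for either answer.

```r
set.seed(42)
trial.df <- data.frame(
  model = rep(1:3, each = 5),
  time = rep(1:5, 3),
  cases = rnorm(15, mean = 1, sd = 0.2)
)

# Within each model, multiply by 1 until the first value below 1, then by 0
trial.df$cases2 <- ave(
  trial.df$cases,
  trial.df$model,
  FUN = function(x) x * cummin(x >= 1)
)
trial.df
```

Note that this relies on the rows already being sorted by time within each model, as they are in the example data.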