Replacing all values in a column after a value < 1 with zeros in R

Time:10-31

Suppose I have a time series of case counts generated by a transmission model, with each unique model labelled separately:

trial.df <- data.frame(
  model = rep(1:3, each = 5),
  time = rep(1:5, 3),
  cases = rnorm(15, mean = 1, sd = 0.2)
)

In the real world, once the number of cases drops below 1, the disease should go extinct, but because of the way the model code works, cases can climb back above 1 after they have dropped below 1. How can I force all values that come after a value < 1 to be zero?
I tried finding, for each model, the first time point at which floor(cases) is zero and then replacing all values from that time point onwards with zero, but my current code doesn't work:

trial.df %>% 
  group_by(model) %>% 
  filter(floor(cases) == 0) %>% 
  slice(1) %>% 
  ungroup() -> rows

for (i in nrow(rows)){
trial.df %>% 
  group_by(model) %>% 
  mutate(cases2= case_when(
    model == rows$model[i] & time >= rows$time[i] ~ 0,
    TRUE ~ cases
  )) %>% print(n = Inf)
}

I also tried using replace(), but this only looks at the original data and doesn't take the newly replaced values into account:

trial.df %>% 
  group_by(model) %>% 
  mutate(cases2 = replace(cases, floor(lag(cases)) == 0, 0))

I get the feeling this should be relatively straightforward, but can't seem to get it to work. Would really appreciate the advice! Thanks

CodePudding user response:

You can use dplyr::cumany:

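library(dplyr)  # provides %>%, group_by(), mutate(), if_else(), cumany()

set.seed(42)    # make the random draws reproducible
trial.df <- data.frame(
  model = rep(1:3, each = 5),
  time = rep(1:5, 3),
  cases = rnorm(15, mean = 1, sd = 0.2)
)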

trial.df %>%
  group_by(model) %>%
  mutate(cases2 = if_else(cumany(cases < 1), 0, cases)) %>%
  ungroup()
# # A tibble: 15 x 4
#    model  time cases cases2
#    <int> <int> <dbl>  <dbl>
#  1     1     1 1.27    1.27
#  2     1     2 0.887   0   
#  3     1     3 1.07    0   
#  4     1     4 1.13    0   
#  5     1     5 1.08    0   
#  6     2     1 0.979   0   
#  7     2     2 1.30    0   
#  8     2     3 0.981   0   
#  9     2     4 1.40    0   
# 10     2     5 0.987   0   
# 11     3     1 1.26    1.26
# 12     3     2 1.46    1.46
# 13     3     3 0.722   0   
# 14     3     4 0.944   0   
# 15     3     5 0.973   0   
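
To see why this works, here is a minimal base-R sketch of the same idea on a single group. The vector x is made up for illustration, and the sketch assumes there are no missing values, in which case cumsum(x < 1) > 0 should give the same flags as cumany(x < 1):

```r
# Toy vector, invented for illustration (not taken from trial.df)
x <- c(1.3, 1.1, 0.9, 1.2, 0.8)

below <- x < 1                    # TRUE wherever a value is below 1
seen_below <- cumsum(below) > 0   # TRUE from the first value below 1 onwards

# Zero out everything from the first value below 1 onwards
cases2 <- ifelse(seen_below, 0, x)
cases2
```

The flag stays TRUE once the first value below 1 has been seen, so later values that climb back above 1 are zeroed as well.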

CodePudding user response:

Thanks for that response Henrik, it worked brilliantly! Don't know why I can't see your answer now, so sharing it so others can learn from it.

Using the cummin() function:

x <- c(2, 1, 0.5, 0, 1)
x * cummin(x >= 1)
# [1] 2 1 0 0 0

trial.df %>% 
  group_by(model) %>% 
  mutate(cases2 = cases * cummin(cases >= 1)) %>% 
  print(n = Inf)
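
The trick relies on cases >= 1 being coerced to 1/0, so cummin() stays at 1 until the first value below 1 and is 0 from then on. For anyone without dplyr, here is a rough base-R sketch of the grouped version using ave(). The data frame toy.df and its values are made up for illustration:

```r
# Toy data, invented for illustration: two models, three time points each
toy.df <- data.frame(
  model = rep(1:2, each = 3),
  time  = rep(1:3, 2),
  cases = c(1.2, 0.9, 1.4, 1.1, 1.3, 0.7)
)

# ave() applies the function within each model and returns the results
# in the original row order
toy.df$cases2 <- ave(
  toy.df$cases, toy.df$model,
  FUN = function(x) x * cummin(x >= 1)
)
toy.df
```

In model 1 the 1.4 at time 3 becomes 0 because it comes after the 0.9, while model 2 keeps its first two values and only loses the last one.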

I was sure there was a simple solution to it. I should really understand the use of the cum..() functions more! Thanks again!
