Home > Net >  creating a dummy variable with consecutive cases
creating a dummy variable with consecutive cases

Time:10-15

I have a similar problem like this one: How can I create a dummy variable over consecutive values by group id?

the difference is: as soon I have the Dummy = 1 I want my dummy for the rest of my group (ID) beeing 1 since year is in descending order. So for example, out of df1:

 df1 <-data.frame(ID = rep(seq(1:3), each = 4),
           year = rep(c(2014, 2015, 2016, 2017),3),
           value = runif(12, min = 0, max = 25),
           Dummy = c(0,0,1,0 ,0,1,0,1, 1,0,0,0))

shall be :

df2 <- data.frame(ID = rep(seq(1:4), 3),
           year = rep(c(2014, 2015, 2016, 2017),3),
           value = runif(12, min = 0, max = 25),
           Dummy = c(0,0,1,1 ,0,1,1, 1, 1,1,1,1))

I've tried something like that (and some others) but that failed:

df2<- df1%>% group_by(ID) %>% arrange(ID , year) %>% 
                        mutate(treated  =  case_when(Dummy == 1 ~ 1,
                        lag(Dummy, n= unique(n()), default = 0) == 1 ~ 1))

CodePudding user response:

If your input data is as below then we can just use cummax():

library(dplyr)


df1 <-data.frame(ID = rep(seq(1:3), each = 4),
                 year = rep(c(2014, 2015, 2016, 2017),3),
                 value = runif(12, min = 0, max = 25),
                 Dummy = c(0,0,1,0 ,0,1,0,1, 1,0,0,0))
df1
#>    ID year     value Dummy
#> 1   1 2014 14.144996     0
#> 2   1 2015 20.621603     0
#> 3   1 2016  8.325170     1
#> 4   1 2017 21.725028     0
#> 5   2 2014 11.894383     0
#> 6   2 2015 13.445744     1
#> 7   2 2016  3.332338     0
#> 8   2 2017  2.984941     1
#> 9   3 2014 17.551266     1
#> 10  3 2015  5.250556     0
#> 11  3 2016 11.062577     0
#> 12  3 2017 20.169439     0


df1 %>%
  group_by(ID) %>%
  mutate(Dummy = cummax(Dummy))

#> # A tibble: 12 x 4
#> # Groups:   ID [3]
#>       ID  year value Dummy
#>    <int> <dbl> <dbl> <dbl>
#>  1     1  2014 14.1      0
#>  2     1  2015 20.6      0
#>  3     1  2016  8.33     1
#>  4     1  2017 21.7      1
#>  5     2  2014 11.9      0
#>  6     2  2015 13.4      1
#>  7     2  2016  3.33     1
#>  8     2  2017  2.98     1
#>  9     3  2014 17.6      1
#> 10     3  2015  5.25     1
#> 11     3  2016 11.1      1
#> 12     3  2017 20.2      1

Created on 2022-10-14 by the reprex package (v2.0.1)

  • Related