Home > Blockchain >  Count months before reaching value
Count months before reaching value

Time:05-20

I want to calculate per id per row how many months there are before reaching the value 1 in a another colomn. I have a couple of thousands IDs en many months to work with.

id period sick
1 1-1-1990 0
1 1-2-1990 0
1 1-3-1990 1
1 1-4-1990 0
2 1-1-1990 0
2 1-2-1990 0
2 1-3-1990 1
2 1-4-1990 0
2 1-5-1990 0

and I would like to have an output like this

id period sick months_until
1 1-1-1990 0 2
1 1-2-1990 0 1 
1 1-3-1990 1 0 
1 1-4-1990 0 -1 
2 1-1-1990 0 2 
2 1-2-1990 0 1 
2 1-3-1990 1 0 
2 1-4-1990 0 -1 
2 1-5-1990 0 -2 

CodePudding user response:

A possible solution, based on dplyr:

library(dplyr)

df %>% 
  group_by(id) %>% 
  mutate(months_until = (which(sick == 1) - row_number())) %>% 
  ungroup

#> # A tibble: 9 × 4
#>      id period    sick months_until
#>   <int> <chr>    <int>        <int>
#> 1     1 1-1-1990     0            2
#> 2     1 1-2-1990     0            1
#> 3     1 1-3-1990     1            0
#> 4     1 1-4-1990     0           -1
#> 5     2 1-1-1990     0            2
#> 6     2 1-2-1990     0            1
#> 7     2 1-3-1990     1            0
#> 8     2 1-4-1990     0           -1
#> 9     2 1-5-1990     0           -2

CodePudding user response:

Using data.table:

setDT(d)[, months_until := month(as.Date(period, "%d-%m-%Y"))[ sick == 1 ] - seq_len(.N), by = id][]
#    id   period sick months_until
# 1:  1 1-1-1990    0            2
# 2:  1 1-2-1990    0            1
# 3:  1 1-3-1990    1            0
# 4:  1 1-4-1990    0           -1
# 5:  2 1-1-1990    0            2
# 6:  2 1-2-1990    0            1
# 7:  2 1-3-1990    1            0
# 8:  2 1-4-1990    0           -1
# 9:  2 1-5-1990    0           -2

CodePudding user response:

Here is another idea without using any libraries and utilising seq() by creating a custom function, i.e.

f1 <- function(x){
  i1 <- which(x == 1) - 1
  i2 <- i1 - (length(x) - 1)
  seq(i1, i2)
}

df$new <- with(df, ave(sick, id, FUN = f1))

> df
  id   period sick new
1  1 1-1-1990    0   2
2  1 1-2-1990    0   1
3  1 1-3-1990    1   0
4  1 1-4-1990    0  -1
5  2 1-1-1990    0   2
6  2 1-2-1990    0   1
7  2 1-3-1990    1   0
8  2 1-4-1990    0  -1
9  2 1-5-1990    0  -2
  •  Tags:  
  • r
  • Related