I have the following dataset:
ID  year  start_year
a   1     1
a   2     1
a   3     1
b   1     2
b   2     2
b   3     2
c   1     3
c   2     3
c   3     3
I want to create a new dummy column 'present' that, for each ID, is 1-1-1 if start_year is 1, 0-1-1 if start_year is 2, and 0-0-1 if start_year is 3.
My goal is to get the following table:
ID  year  start_year  present
a   1     1           1
a   2     1           1
a   3     1           1
b   1     2           0
b   2     2           1
b   3     2           1
c   1     3           0
c   2     3           0
c   3     3           1
I guess this should be fairly easy for most of you, but I'm really stuck. Many thanks for your help!
CodePudding user response:
An easier option is to create a key/value list and then subset it with the first element of 'start_year' for each 'ID' (assuming there are only 3 elements per group):
library(dplyr)
lst1 <- list(`1` = c(1, 1, 1), `2` = c(0, 1, 1), `3` = c(0, 0, 1))
df1 %>%
  group_by(ID) %>%
  mutate(present = lst1[[as.character(first(start_year))]]) %>%
  ungroup()
-output
# A tibble: 9 × 4
  ID     year start_year present
  <chr> <int>      <int>   <dbl>
1 a         1          1       1
2 a         2          1       1
3 a         3          1       1
4 b         1          2       0
5 b         2          2       1
6 b         3          2       1
7 c         1          3       0
8 c         2          3       0
9 c         3          3       1
data
df1 <- structure(list(ID = c("a", "a", "a", "b", "b", "b", "c", "c",
"c"), year = c(1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L), start_year = c(1L,
1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L)), class = "data.frame", row.names = c(NA,
-9L))
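The key/value lookup above relies on every 'ID' having exactly three rows in year order. As a rough sketch of how that assumption might be relaxed, the looked-up pattern can be indexed by 'year', so each row picks its own entry. This assumes 'year' only takes the values 1, 2 and 3 and can be used as a position in the pattern; the name `res` is just for illustration.

```r
library(dplyr)

# Same data as 'df1' above, rebuilt here so the sketch is self-contained
df1 <- data.frame(
  ID = rep(c("a", "b", "c"), each = 3),
  year = rep(1:3, times = 3),
  start_year = rep(1:3, each = 3)
)

# Same key/value list as in the answer
lst1 <- list(`1` = c(1, 1, 1), `2` = c(0, 1, 1), `3` = c(0, 0, 1))

res <- df1 %>%
  group_by(ID) %>%
  # Look up the pattern for the group's start_year, then pick the entry
  # for each row's year (assumes year is 1, 2 or 3)
  mutate(present = lst1[[as.character(first(start_year))]][year]) %>%
  ungroup()

res
```

With this variant a group that is missing a year, or whose rows are not sorted, should still get one value per row, whereas assigning the whole length-3 pattern requires the group to have exactly three rows.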
CodePudding user response:
A possible approach is to compare 'start_year' with 'year' row by row, which needs no grouping:
library(tidyverse)
df <- tribble(
  ~ID, ~year, ~start_year,
  "a", 1, 1,
  "a", 2, 1,
  "a", 3, 1,
  "b", 1, 2,
  "b", 2, 2,
  "b", 3, 2,
  "c", 1, 3,
  "c", 2, 3,
  "c", 3, 3
)
df |> mutate(present = if_else(start_year <= year, 1, 0))
#> # A tibble: 9 × 4
#>   ID     year start_year present
#>   <chr> <dbl>      <dbl>   <dbl>
#> 1 a         1          1       1
#> 2 a         2          1       1
#> 3 a         3          1       1
#> 4 b         1          2       0
#> 5 b         2          2       1
#> 6 b         3          2       1
#> 7 c         1          3       0
#> 8 c         2          3       0
#> 9 c         3          3       1
Created on 2022-05-27 by the reprex package (v2.0.1)
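The same row-wise comparison can also be written without any packages. This is only a sketch in base R, using a plain data.frame rebuilt from the question's data; `as.integer()` is used here so the dummy comes out as 1/0 integers rather than doubles.

```r
# Same data as in the question, as a base R data.frame
df <- data.frame(
  ID = rep(c("a", "b", "c"), each = 3),
  year = rep(1:3, times = 3),
  start_year = rep(1:3, each = 3)
)

# The comparison is TRUE once 'year' has reached 'start_year';
# as.integer() converts TRUE/FALSE to 1/0
df$present <- as.integer(df$start_year <= df$year)

df
```

Because the comparison is vectorised over rows, it should not depend on how many rows each 'ID' has or on their order.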