I have a data frame like this one:
ID CODE
1 code_12
1 code_67
2 code_76
3 code_34
I would like to put each "CODE" I have on my rows in a column by assigning a 1 for the subjects that have this code in row and 0 otherwise.
When the same subject has several lines (for example subject 1), I would like to repeat the "1" on all its lines, in order to obtain a df like this one:
ID CODE code_12 code_67 code_76 code_34
1 code_12 1 1 0 0
1 code_67 1 1 0 0
2 code_76 0 0 1 0
3 code_34 0 0 0 1
I managed to do it "code by code" in the way below, but I would like a way to automate this (with a loop or something) if possible as I have a lot of different codes
df$code_12<-ifelse(df$CODE=="12", "1", "0")
df<-df%>%
group_by(ID) %>% mutate(code_12 = max(code_12))
df$code_67<-ifelse(df$CODE== "67", "1", "0")
df<-df%>%
group_by(ID) %>% mutate(code_67 = max(code_67))
Thanks in advance for the help
CodePudding user response:
Using tidyr
, you can use pivot_wider
with a value column set to 1
library(tidyr)
df <- df %>%
mutate(value = 1) %>%
pivot_wider(names_from = CODE) %>%
mutate(across(everything(), ~replace(., is.na(.), 0) )) %>%
right_join(df, by = "ID") %>%
relocate(CODE, .after = ID)
Output
> df
# A tibble: 4 x 6
ID CODE code_12 code_67 code_76 code_34
<dbl> <chr> <dbl> <dbl> <dbl> <dbl>
1 1 code_12 1 1 0 0
2 1 code_67 1 1 0 0
3 2 code_76 0 0 1 0
4 3 code_34 0 0 0 1
CodePudding user response:
Here is a base R option using table
crossprod
> with(df, cbind(df, crossprod(table(df))[CODE, CODE]))
ID CODE code_12 code_67 code_76 code_34
code_12 1 code_12 1 1 0 0
code_67 1 code_67 1 1 0 0
code_76 2 code_76 0 0 1 0
code_34 3 code_34 0 0 0 1