I have a large data set in which one column contains 5 different values. I want to create 5 new columns (one for each value) based on that column. Each row in those new columns should contain 1 or 0, depending on whether the original column holds that value in that row. For example, suppose this is the data frame:
l1=data.frame(c1= c("A","A","B","B","B","C"), c2 = c("Blue","Green","Red","yellow","Black","Blue"))
The output should be
l2=data.frame(c1=c("A","A","B","B","B","C"), Blue=c(1,0,0,0,0,1),Green=c(0,1,0,0,0,0),Red=c(0,0,1,0,0,0),yellow=c(0,0,0,1,0,0),Black=c(0,0,0,0,1,0))
Thank you!
CodePudding user response:
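library(tidyverse)

l1 %>%
  # n is the value to spread; dummy gives each row a unique id so that
  # pivot_wider() does not collapse rows that share the same c1
  mutate(n = 1, dummy = row_number()) %>%
  # one column per value of c2, filled with 0 where that value is absent
  pivot_wider(names_from = c2, values_from = n, values_fill = 0) %>%
  select(-dummy)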
Result
# A tibble: 6 × 6
  c1     Blue Green   Red yellow Black
  <chr> <dbl> <dbl> <dbl>  <dbl> <dbl>
1 A         1     0     0      0     0
2 A         0     1     0      0     0
3 B         0     0     1      0     0
4 B         0     0     0      1     0
5 B         0     0     0      0     1
6 C         1     0     0      0     0
CodePudding user response:
Using model.matrix, you could do:
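# Binding onto the data frame l1["c1"] keeps the indicator columns numeric;
# cbind() on a character vector and a matrix would coerce everything to character
l2 <- cbind(l1["c1"], model.matrix(~ c2 - 1, l1))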
names(l2) <- gsub("^c2", "", names(l2))
l2
#>   c1 Black Blue Green Red yellow
#> 1  A     0    1     0   0      0
#> 2  A     0    0     1   0      0
#> 3  B     0    0     0   1      0
#> 4  B     0    0     0   0      1
#> 5  B     1    0     0   0      0
#> 6  C     0    1     0   0      0
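
Note that model.matrix orders the new columns alphabetically, while the output in the question lists them in order of first appearance. If that order matters, a minimal base R sketch could look like the following. The names `lv`, `ind`, and `l3` are only illustrative, and the sketch rebuilds `l1` from the question so it runs on its own.

```r
# Sample data from the question
l1 <- data.frame(
  c1 = c("A", "A", "B", "B", "B", "C"),
  c2 = c("Blue", "Green", "Red", "yellow", "Black", "Blue")
)

# Distinct values of c2 in order of first appearance (not alphabetical)
lv <- unique(as.character(l1$c2))

# For each value, flag the rows where c2 equals it (1) or not (0);
# sapply() over a character vector names the resulting columns after lv
ind <- sapply(lv, function(v) as.integer(l1$c2 == v))

# Attach the indicator columns to c1
l3 <- cbind(l1["c1"], ind)
l3
```

Here the columns of `l3` come out as c1, Blue, Green, Red, yellow, Black, and the indicator columns are integers rather than doubles.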