I have a dataframe that looks like this:
ID Region
1 a b c d
2 b c d e g
3 a c e f g h
.
.
.
I want to put it into a csv file that looks like this
ID a b c d e f g h
1 1 1 1 1 NA NA NA NA
2 NA 1 1 1 1 NA 1 NA
3 1 NA 1 NA 1 1 1 1
.
.
.
I can convert the dataframe into a csv file, but don't know how to format it this way. Any help would be greatly appreciated!
CodePudding user response:
If your data frame looks like this:
df <- data.frame(ID = 1:3, Region = c("a b c d", "b c d e g", "a c e f g h"))
df
#> ID Region
#> 1 1 a b c d
#> 2 2 b c d e g
#> 3 3 a c e f g h
And you have a vector of all possible regions you want to define:
all_regions <- c("a", "b", "c", "d", "e", "f", "g", "h")
Then you can do:
cbind(ID = df$ID,
setNames(as.data.frame(t(sapply(strsplit(df$Region, " "),
function(x) as.numeric(all_regions %in% x)))),
all_names))
#> ID a b c d e f g h
#> 1 1 1 1 1 1 0 0 0 0
#> 2 2 0 1 1 1 1 0 1 0
#> 3 3 1 0 1 0 1 1 1 1
CodePudding user response:
Using reshape2
library(reshape2)
df2=melt(strsplit(df$Region,""))
df2=df2[df2$value!=" ",]
df2=dcast(df2,L1~value)
ifelse(is.na(df2),NA,1)
L1 a b c d e f g h
[1,] 1 1 1 1 1 NA NA NA NA
[2,] 1 NA 1 1 1 1 NA 1 NA
[3,] 1 1 NA 1 NA 1 1 1 1