I have character data like these:
a<-"cat,hammer,green"
b<-"hammer,green"
c<-"cat,hammer,green"
d<-"cat, green"
e<-"green,cat"
f<-"hammer"
library(stringr)
df<-data.frame(Col1=rbind(a,b,c,d,e,f))
df<-as.data.frame(str_split(df$Col1,",",simplify=TRUE))
df
The column order should be cat, hammer, green. However, my data have missing values, and in some rows the animal-tool-color order is mixed up. How can I define the correct order and then get each animal, tool, and color into its proper column, with NA where a value is missing, as shown below?
cat hammer green
NA hammer green
cat hammer green
cat NA green
cat NA green
NA hammer NA
CodePudding user response:
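Here's a way: check which of the expected values each row contains, then set the cells for the missing ones to NA.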
a<-"cat,hammer,green"
b<-"hammer,green"
c<-"cat,hammer,green"
d<-"cat, green"
e<-"green,cat"
f<-"hammer"
df<-data.frame(Col1=rbind(a,b,c,d,e,f))
# Split on a comma plus any following spaces, so "cat, green" yields "green" rather than " green"
cond <- t(sapply(strsplit(df$Col1,",\\s*"), function(x) c("cat", "hammer", "green") %in% x))
df2 <- data.frame(animal="cat", tool = "hammer", color = "green")[rep(1, nrow(df)),]
df2[!cond] <- NA
df2
#> animal tool color
#> 1 cat hammer green
#> 1.1 <NA> hammer green
#> 1.2 cat hammer green
#> 1.3 cat <NA> green
#> 1.4 cat <NA> green
#> 1.5 <NA> hammer <NA>
#> 1.4 cat <NA> green
#> 1.5 <NA> hammer <NA>
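If the real data contain more than one possible animal, tool, or color, the same idea can be extended by giving each column its own vocabulary. The following is only a minimal sketch under that assumption: the helper name `align_by_category` and the extra vocabulary entries ("dog", "saw", "red") are hypothetical and do not come from the question. It uses base R only and takes the first match per category in each row.

```r
# Hypothetical helper: place each comma-separated value in the column
# whose vocabulary contains it; categories with no match become NA.
align_by_category <- function(x, categories) {
  # Split on commas and trim stray spaces such as the one in "cat, green"
  parts <- lapply(strsplit(x, ","), trimws)
  cols <- lapply(categories, function(vocab) {
    vapply(parts, function(p) {
      hit <- p[p %in% vocab]
      # Keep the first match for this category, or NA if there is none
      if (length(hit) > 0) hit[1] else NA_character_
    }, character(1))
  })
  as.data.frame(cols, stringsAsFactors = FALSE)
}

# Assumed vocabularies; only cat, hammer, and green appear in the question
categories <- list(
  animal = c("cat", "dog"),
  tool   = c("hammer", "saw"),
  color  = c("green", "red")
)

x <- c("cat,hammer,green", "hammer,green", "cat,hammer,green",
       "cat, green", "green,cat", "hammer")
res <- align_by_category(x, categories)
res
```

The order of the `categories` list defines the column order of the result, so the input order within each string no longer matters. Values that belong to no vocabulary are silently dropped, which may or may not be what you want for your data.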