Using if/else to check multiple columns and create a new column based on the response for string res

I have the following dataset:

hairdf <- data.frame(
  id=c(1:2),
  typedad=c("straight*","curly"),
  colourdad=c("brown","black"),
  typemom=c("curly","wavy*"),
  colourmom=c("blonde","red"),
  typekid1=c("wavy","mixed*"),
  colourkid1=c("black","blonde"))

I want to create new columns that look at the hair-type columns and give the value 1 if a hair type appears in one of them without an asterisk and the value 2 if it appears with an asterisk (blank if it doesn't appear in that row). The result should look like this:

id	typedad	colourdad	typemom	colourmom	typekid1	colourkid1	straight	curly	wavy	mixed
1	straight*	brown	curly	blonde	wavy	black	2	1	1
2	curly	black	wavy*	red	mixed*	blonde		1	2	2

My two issues are that all the other examples I have found use numeric values and have the columns of interest located next to each other. I need code that matches strings in columns that can be located anywhere in the data frame. I have tried the following:

library(dplyr)

straight <- hairdf %>% mutate(across(c("typedad", "typemom", "typekid1"),
                                     ~ ifelse(. == "straight", 1,
                                              ifelse(. == "straight*", 2, ""))))
curly <- hairdf %>% mutate(across(c("typedad", "typemom", "typekid1"),
                                  ~ ifelse(. == "curly", 1,
                                           ifelse(. == "curly*", 2, ""))))
wavy <- hairdf %>% mutate(across(c("typedad", "typemom", "typekid1"),
                                 ~ ifelse(. == "wavy", 1,
                                          ifelse(. == "wavy*", 2, ""))))
mixed <- hairdf %>% mutate(across(c("typedad", "typemom", "typekid1"),
                                  ~ ifelse(. == "mixed", 1,
                                           ifelse(. == "mixed*", 2, ""))))

But I'm not sure if this code even makes sense. It will also be tedious, as I have many more hair types, so any suggestions to make it easier would be appreciated as well. Thank you!

CodePudding user response：

This is neither the most efficient nor the most general solution, but it should do the job:

#create columns
st <- rep(NA,nrow(hairdf));
cur <- rep(NA,nrow(hairdf));
wav <- rep(NA,nrow(hairdf));
mix <- rep(NA,nrow(hairdf));

#join and define words
hairdf <- cbind(hairdf,st,cur,wav,mix);
words <- c("straight","curly","wavy","mixed");
words_ast <- paste(words,"*",sep=""); #just get the "*" words

#loop over the starred words and fill the new columns st,cur,wav,mix (columns 8 to 11)
for (j in seq_along(words_ast)){ #mark matches of the starred words with 2
  for (i in c(2,4,6)){ #only the hair-type columns: typedad, typemom, typekid1
    a <- subset(hairdf,hairdf[,i]==words_ast[j]) #rows that satisfy the condition; equivalent to hairdf %>% subset(.[,i]==words_ast[j])
    hairdf[row.names(a),7+j] <- 2 #write 2 into the new column for this word
  }
}
#repeat the process for "words"

for (j in seq_along(words)){
  for (i in c(2,4,6)){
    a <- subset(hairdf,hairdf[,i]==words[j])
    hairdf[row.names(a),7+j] <- 1
  }
}
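
The loops above address the columns by position. As a rough sketch of the same idea that picks the hair-type columns by name (so they can sit anywhere in the data frame) and reads the list of hair types from the data itself, something like the following might work. The "^type" naming pattern, the helper name flag_type, and the use of NA for a blank cell are assumptions made for illustration; they are not part of the question or the answer above.

```r
# Example data: two rows, as in the expected output
hairdf <- data.frame(
  id = 1:2,
  typedad = c("straight*", "curly"),
  colourdad = c("brown", "black"),
  typemom = c("curly", "wavy*"),
  colourmom = c("blonde", "red"),
  typekid1 = c("wavy", "mixed*"),
  colourkid1 = c("black", "blonde"),
  stringsAsFactors = FALSE
)

# Assumption: every hair-type column name starts with "type"
type_cols <- grep("^type", names(hairdf), value = TRUE)
types <- as.matrix(hairdf[type_cols])

# Hair types present in the data, with any trailing asterisk removed
words <- unique(sub("\\*$", "", as.vector(types)))

# Hypothetical helper: 2 if the starred form occurs in the row,
# 1 if only the plain form occurs, NA otherwise
flag_type <- function(w) {
  plain <- rowSums(types == w, na.rm = TRUE) > 0
  starred <- rowSums(types == paste0(w, "*"), na.rm = TRUE) > 0
  ifelse(starred, 2, ifelse(plain, 1, NA))
}

# Add one new column per hair type, named after the type
for (w in words) {
  hairdf[[w]] <- flag_type(w)
}
print(hairdf)
```

Because the hair types are taken from the data, adding more types or more "type" columns should not require any change to the code, as long as the naming assumption holds.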

These loops should give you the expected result. Alternatively, you can use the assign() function, i.e.

assign(x,value=1)

where x is each element in words.

So in a loop:

assign(words[n],value=1) ; assign(words_ast[n],value=2)
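
Written out as a runnable loop, that suggestion might look like the sketch below. Keep in mind that assign() binds a name to a value in the current environment; on its own it does not add columns to hairdf, so the resulting variables act as a lookup from a cell's text to its code. The loop variable n comes from the snippet above; everything else here is illustrative.

```r
words <- c("straight", "curly", "wavy", "mixed")
words_ast <- paste0(words, "*")

for (n in seq_along(words)) {
  assign(words[n], value = 1)      # e.g. a variable named straight
  assign(words_ast[n], value = 2)  # e.g. a variable named `straight*`
}

straight          # 1
get("straight*")  # 2; non-syntactic names need get() or backticks
```

With these variables in place, get() applied to the text of a cell (for example get("wavy*")) returns the code for that cell.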