Running a function on each set of three columns

Time:10-19

In my data frame, the three responses (yes, maybe, no) to a question are stored as three separate variables (one binary indicator for each possible response).

I want to combine the three binary responses into one variable, showing which response was selected.

The following piece of code does this:

data$var1 <- ifelse(data$var1.Yes, 0,
                     ifelse(data$var1.Maybe, 1, 
                            ifelse(data$var1.No,2, NA)))

However, because I have many variables (e.g., var1, var2, var3, etc.), I want to write a function or loop that runs this code for multiple variables whose column names include ascending numbers.

I thought of the following function:

fun <- function(i){
  paste0("data$var", i) <- ifelse(paste0("data$var", i, ".Yes"), 0,
                       ifelse(paste0("data$var",i,".Maybe"), 1, 
                              ifelse(paste0("data$var",i,".No"),2, NA)))
}
fun(1:3)

Unfortunately, this does not work. How can I apply this function to several variables at once?

dput(test)

structure(list(var1.Yes = c(0, 0, 1, 0, 1, 1, 1, 0, NA, 1), 
               var1.Maybe = c(1, 0, 0, 1, 0, 0, 0, 0, NA, 0), 
               var1.No= c(0, 1, 0, 0, 0, 0, 0, 1, NA, 1),
               var2.Yes = c(0, 0, 1, NA, 1, 1, 1, 0, 0, 1), 
               var2.Maybe = c(0, 1, 1, NA, 0, 0, 0, 0, 0, 0), 
               var2.No= c(1, 0, 0, NA, 0, 0, 0, 1, 1, 0), 
               var3.Yes = c(0, 1, 0, 0, 0, 0, 0, NA, 0, 1),
               var3.Maybe = c(0, 0, 0, 0, 1, 1, 1, NA, 1, 0), 
               var3.No = c(1, 0, 1, 1, 0, 0, 0, NA, 0, 0)),
          class = "data.frame", row.names = c(NA, -10L))

CodePudding user response:

You can loop over the columns in groups of three:

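# For group `col`, columns col*3-2, col*3-1 and col*3 are Yes, Maybe and No
lapply_results <- lapply(seq_len(ncol(test)/3), function(col) ifelse(test[,col*3-2], 0, 
                                              ifelse(test[,col*3-1], 1, 
                                                     ifelse(test[,col*3], 2, NA))))
lapply_results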

# [[1]]
#   [1]  1  2  0  1  0  0  0  2 NA  0
# 
# [[2]]
#   [1]  2  1  0 NA  0  0  0  2  2  0
# 
# [[3]]
#   [1]  2  0  2  2  1  1  1 NA  1  0
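
The index arithmetic above (`col*3-2`, `col*3-1`, `col*3`) only gives the right answer if the columns come in consecutive Yes, Maybe, No triplets. A quick check along the following lines could catch a different ordering before the loop runs. This is only a sketch and not part of the original answer; the names `cols`, `suffixes` and `layout_ok` are made up for illustration, and `cols` is typed out here so the snippet runs on its own (in practice it would be `names(test)`).

```r
# Column names as they appear in the `test` data frame from this thread.
# In practice: cols <- names(test)
cols <- c("var1.Yes", "var1.Maybe", "var1.No",
          "var2.Yes", "var2.Maybe", "var2.No",
          "var3.Yes", "var3.Maybe", "var3.No")

# Keep only the part after the last dot: "Yes", "Maybe", "No", ...
suffixes <- sub(".*\\.", "", cols)

# TRUE only if the columns form complete Yes/Maybe/No triplets in that order.
layout_ok <- length(cols) %% 3 == 0 &&
  all(suffixes == rep(c("Yes", "Maybe", "No"), length.out = length(cols)))
layout_ok
```

If `layout_ok` is `FALSE`, reordering the columns first or selecting them by name is probably safer than relying on positions.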

This can be merged with your data:

cbind(test, matrix(unlist(lapply_results), nrow = nrow(test)))
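
If you would rather select columns by name, as the function in the question tried to do, a variant like the one below may work. It is a sketch rather than part of the original answer: `combine_responses` is a hypothetical helper, and it assumes every question has columns named `<prefix>.Yes`, `<prefix>.Maybe` and `<prefix>.No`. The key difference from the attempt in the question is that `paste0()` only builds a character string, so the string has to be passed to `[[` to fetch the actual column.

```r
# Same values as the `test` data frame in the "Data:" section of this thread.
test <- data.frame(
  var1.Yes   = c(0, 0, 1, 0, 1, 1, 1, 0, NA, 1),
  var1.Maybe = c(1, 0, 0, 1, 0, 0, 0, 0, NA, 0),
  var1.No    = c(0, 1, 0, 0, 0, 0, 0, 1, NA, 1),
  var2.Yes   = c(0, 0, 1, NA, 1, 1, 1, 0, 0, 1),
  var2.Maybe = c(0, 1, 1, NA, 0, 0, 0, 0, 0, 0),
  var2.No    = c(1, 0, 0, NA, 0, 0, 0, 1, 1, 0),
  var3.Yes   = c(0, 1, 0, 0, 0, 0, 0, NA, 0, 1),
  var3.Maybe = c(0, 0, 0, 0, 1, 1, 1, NA, 1, 0),
  var3.No    = c(1, 0, 1, 1, 0, 0, 0, NA, 0, 0)
)

# Hypothetical helper: collapse <prefix>.Yes/.Maybe/.No into one 0/1/2 code.
combine_responses <- function(data, prefix) {
  yes   <- data[[paste0(prefix, ".Yes")]]
  maybe <- data[[paste0(prefix, ".Maybe")]]
  no    <- data[[paste0(prefix, ".No")]]
  # as.logical() makes the 0/1 to FALSE/TRUE conversion explicit; NA stays NA.
  ifelse(as.logical(yes), 0,
         ifelse(as.logical(maybe), 1,
                ifelse(as.logical(no), 2, NA)))
}

prefixes <- paste0("var", 1:3)
# Assigning a list of vectors to new column names adds var1, var2 and var3.
test[prefixes] <- lapply(prefixes, combine_responses, data = test)
```

As in the original code, Yes takes priority over Maybe and No when more than one indicator is set in a row, and a row where all three indicators are 0 or `NA` ends up as `NA`.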

Data:

data.frame(
var1.Yes  = c(0, 0, 1, 0, 1, 1, 1, 0, NA, 1),
var1.Maybe= c(1, 0, 0, 1, 0, 0, 0, 0, NA, 0),
var1.No   = c(0, 1, 0, 0, 0, 0, 0, 1, NA, 1),
var2.Yes  = c(0, 0, 1, NA, 1, 1, 1, 0, 0, 1),
var2.Maybe= c(0, 1, 1, NA, 0, 0, 0, 0, 0, 0),
var2.No   = c(1, 0, 0, NA, 0, 0, 0, 1, 1, 0),
var3.Yes  = c(0, 1, 0, 0, 0, 0, 0, NA, 0, 1),
var3.Maybe= c(0, 0, 0, 0, 1, 1, 1, NA, 1, 0),
var3.No   = c(1, 0, 1, 1, 0, 0, 0, NA, 0, 0)) -> test