How to loop a function over all elements of a vector except one and store the result in separate col

Time:10-23

I have a data frame with several columns, and a vector holding the names of some of them. For each name in that vector, I want to run a function (pmax() in this case) over all the other columns named in the vector, leaving that one out, and store each result in a new column. At the end, I would also like to store the names of all the new columns in a separate vector. A minimal example:

Name <- c("Case 1", "Case 2", "Case 3", "Case 4", "Case 5")
C1 <- c(1, 0, 1, 1, 0)
C2 <- c(0, 1, 1, 1, 0)
C3 <- c(0, 1, 0, 0, 0)
C4 <- c(1, 1, 0, 1, 0)
Data <- data.frame(Name, C1, C2, C3, C4)

var.min <- function(data, col.names){
                    new.df <- data

                    # This is how I would do it outside a function and without loop:
                    new.df$max.def.col.exc.1 <- pmax(new.df$C2, new.df$C3)
                    new.df$max.def.col.exc.2 <- pmax(new.df$C1, new.df$C3)
                    new.df$max.def.col.exc.3 <- pmax(new.df$C1, new.df$C2)
                    
                    new.columns <- c("max.def.col.exc.1", "max.def.col.exc.2", "max.def.col.exc.3")

                    return(new.df)
}

new.df <- var.min(Data,
                  col.names= c("C1", "C2", "C3"))

The result should look like:

    Name C1 C2 C3 C4 max.def.col.exc.1 max.def.col.exc.2 max.def.col.exc.3
1 Case 1  1  0  0  1                 0                 1                 1
2 Case 2  0  1  1  1                 1                 1                 1
3 Case 3  1  1  0  0                 1                 1                 1
4 Case 4  1  1  0  1                 1                 1                 1
5 Case 5  0  0  0  0                 0                 0                 0

Anyone with an idea? Many thanks in advance!

CodePudding user response:

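Here is a base R solution with combn(). It generates all pairwise combinations of the column names and calls a function on each pair that computes the pmax() of the two columns. With three columns, each pair is exactly what is left when one column is excluded.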

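Note that with cols <- c("C3", "C2", "C1"), as used below, the result columns come out in the same order as in the expected output: the first pair, C3 and C2, is the one that excludes C1, and so on. If the columns vector is c("C1", "C2", "C3") instead, the result columns come out in the reverse order.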
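To see where that ordering comes from, it may help to look at the pairs combn() generates on their own. This is only a small sketch; the helper name `excluded` is made up for this illustration and is not part of the solution's code.

```r
# List, for each combination combn() generates, the one column it leaves out.
# 'excluded' is a hypothetical helper used only for this illustration.
excluded <- function(cols) {
  pairs <- combn(cols, length(cols) - 1)
  apply(pairs, 2, function(p) setdiff(cols, p))
}

combn(c("C3", "C2", "C1"), 2)
#      [,1] [,2] [,3]
# [1,] "C3" "C3" "C2"
# [2,] "C2" "C1" "C1"

excluded(c("C3", "C2", "C1"))
# [1] "C1" "C2" "C3"
excluded(c("C1", "C2", "C3"))
# [1] "C3" "C2" "C1"
```

With the reversed vector, the k-th combination is the one that leaves out the k-th column, which is why the numbering of the new columns lines up.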

Note also that the function is now a one-liner. Because Reduce() folds pmax() over whatever columns it is given, it accepts combinations of any size: 2, 3 or more columns.

var.min <- function(cols, data) Reduce(pmax, data[cols])

cols <- c("C3", "C2", "C1")
combn(cols, 2, var.min, data = Data)
#     [,1] [,2] [,3]
#[1,]    0    1    1
#[2,]    1    1    1
#[3,]    1    1    1
#[4,]    1    1    1
#[5,]    0    0    0
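The one-liner is not limited to pairs because Reduce() applies pmax() step by step across the list it is given. A quick check, using the three columns from the question as standalone vectors:

```r
C1 <- c(1, 0, 1, 1, 0)
C2 <- c(0, 1, 1, 1, 0)
C3 <- c(0, 1, 0, 0, 0)

# Reduce() folds pmax over the list, i.e. pmax(pmax(C1, C2), C3)
folded <- Reduce(pmax, list(C1, C2, C3))
# pmax() itself also accepts more than two vectors
direct <- pmax(C1, C2, C3)

folded
# [1] 1 1 1 1 0
identical(folded, direct)
# [1] TRUE
```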

Now it's just a matter of assigning column names and cbinding with the input data.

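tmp <- combn(cols, 2, var.min, data = Data)
# one name per combination, kept in a vector as asked in the question
new.columns <- paste0("max.def.col.exc.", seq_len(ncol(tmp)))
colnames(tmp) <- new.columns
Data <- cbind(Data, tmp)
rm(tmp)    # final clean-up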
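If you would rather keep everything inside one function, as in the question, the same idea can be written as a loop over the column to leave out. This is only a sketch: the function name `add.max.excl` and its `prefix` argument are assumptions made for this example, not something from the question. It returns the new column names together with the data frame.

```r
Name <- c("Case 1", "Case 2", "Case 3", "Case 4", "Case 5")
C1 <- c(1, 0, 1, 1, 0)
C2 <- c(0, 1, 1, 1, 0)
C3 <- c(0, 1, 0, 0, 0)
C4 <- c(1, 1, 0, 1, 0)
Data <- data.frame(Name, C1, C2, C3, C4)

# Hypothetical helper: for each column in col.names, compute the row-wise
# maximum over the remaining columns in col.names and add it as a new column.
add.max.excl <- function(data, col.names, prefix = "max.def.col.exc.") {
  stopifnot(length(col.names) >= 2, all(col.names %in% names(data)))
  new.columns <- paste0(prefix, seq_along(col.names))
  for (i in seq_along(col.names)) {
    others <- setdiff(col.names, col.names[i])   # leave column i out
    data[[new.columns[i]]] <- Reduce(pmax, data[others])
  }
  # return both the extended data frame and the names of the added columns
  list(data = data, new.columns = new.columns)
}

res <- add.max.excl(Data, col.names = c("C1", "C2", "C3"))
res$new.columns
# [1] "max.def.col.exc.1" "max.def.col.exc.2" "max.def.col.exc.3"
```

Here the order of col.names does not need to be reversed, since each new column is numbered by the position of the column that was left out.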