How to apply a function to several columns listed in a vector in a function

Time:10-22

Within a function, I am trying to add a column to a data frame that holds the row-wise minimum of several other columns, whose names are passed to the function as an argument.

A minimal data set would be:

C1 <- c(1, 0, 1, 1, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0)
C2 <- c(0, 1, 1, 1, 0, 1, 0, 0, 0, 1, 1, 1, 0, 1, 0)
C3 <- c(0, 1, 0, 0, 0, 0, 1, 1, 1, 0, 1, 1, 0, 0, 1)
C4 <- c(0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0)
Data <- data.frame(C1, C2, C3, C4)

If I want the minimum from C1, C2, and C4, outside a function, I would call:

Data$Min <- pmin(Data$C1, Data$C2, Data$C4)

Inside a function, however, I am stuck. This is as far as I got:

min.col <- function(data, conditions){
  # [[ ]] is the wrong way to refer to the columns named in `conditions`,
  # but I cannot find the right one.
  data$Min <- pmin(data[[conditions]])

  # The rest of my function builds on data$Min, but that part is not relevant to the present problem.
}

It would be called like this:

min.col(Data, conditions=c("C1", "C2", "C4"))

Anyone there to help? Many thanks in advance!

CodePudding user response:

These use only base R.

1) We can use do.call("pmin", ...) like this.

f <- function(data, cols) transform(data, min = do.call("pmin", data[cols]))
f(Data, c("C1", "C2", "C4"))

giving:

   C1 C2 C3 C4 min
1   1  0  0  0   0
2   0  1  1  0   0
3   1  1  0  0   0
4   1  1  0  1   1
5   0  0  0  0   0
6   0  1  0  0   0
7   1  0  1  0   0
8   1  0  1  1   0
9   0  0  1  0   0
10  0  1  0  0   0
11  0  1  1  0   0
12  1  1  1  1   1
13  1  0  0  0   0
14  0  1  0  0   0
15  0  0  1  0   0
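
A note on why this works, as a minimal sketch that assumes the question's example data (rebuilt here so the snippet runs on its own). `[[` extracts a single element, whereas `Data[cols]` with a character vector stays a data frame, which is a list of columns. do.call() then hands each list element to pmin as a separate argument, which should be equivalent to writing the call out by hand.

```r
# Rebuild the question's example data so the snippet is self-contained.
Data <- data.frame(
  C1 = c(1, 0, 1, 1, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0),
  C2 = c(0, 1, 1, 1, 0, 1, 0, 0, 0, 1, 1, 1, 0, 1, 0),
  C3 = c(0, 1, 0, 0, 0, 0, 1, 1, 1, 0, 1, 1, 0, 0, 1),
  C4 = c(0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0)
)
cols <- c("C1", "C2", "C4")

# Single-bracket indexing keeps a data frame, i.e. a list with one element per column.
selected <- Data[cols]
is.list(selected)   # TRUE
length(selected)    # 3

# The call written out by hand, as in the question.
explicit <- pmin(Data$C1, Data$C2, Data$C4)

# do.call() spreads the list elements over pmin's arguments.
# Caveat: the column names travel along as argument names, so a column
# literally called "na.rm" would be taken as pmin's na.rm argument.
via_do_call <- do.call(pmin, selected)

all(explicit == via_do_call)   # TRUE
```

The same pattern should carry over to other functions that take their inputs through `...`, such as pmax.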

2) or use apply

f2 <- function(data, cols) transform(data, min = apply(data[cols], 1, min))
f2(Data, c("C1", "C2", "C4"))

3) or Reduce

f3 <- function(data, cols) transform(data, min = Reduce(pmin, data[cols]))
f3(Data, c("C1", "C2", "C4"))
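
To make the fold in 3) visible, here is a small sketch, again assuming the question's data. Reduce() combines the columns pairwise from left to right, so the result is pmin(pmin(C1, C2), C4). Passing accumulate = TRUE is used here only to expose the intermediate steps.

```r
# Rebuild the question's example data so the snippet is self-contained.
Data <- data.frame(
  C1 = c(1, 0, 1, 1, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0),
  C2 = c(0, 1, 1, 1, 0, 1, 0, 0, 0, 1, 1, 1, 0, 1, 0),
  C3 = c(0, 1, 0, 0, 0, 0, 1, 1, 1, 0, 1, 1, 0, 0, 1),
  C4 = c(0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0)
)
cols <- c("C1", "C2", "C4")

# accumulate = TRUE keeps every intermediate result of the fold in a list.
steps <- Reduce(pmin, Data[cols], accumulate = TRUE)

# steps[[1]] is C1 itself, steps[[2]] is pmin(C1, C2),
# and steps[[3]] is pmin(pmin(C1, C2), C4), the final answer.
length(steps)                                   # 3
all(steps[[2]] == pmin(Data$C1, Data$C2))       # TRUE
all(steps[[3]] == Reduce(pmin, Data[cols]))     # TRUE
```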

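4) If data[cols] contains only 0s and 1s, the row minimum is 1 when the row has no 0s and 0 otherwise. When coerced to logical, 0 is treated as FALSE and any other number as TRUE, so !data[cols] marks the zero cells, rowSums() counts them per row, and the outer ! is TRUE exactly when that count is 0. Note that the resulting column is logical (TRUE/FALSE) rather than 0/1; wrap the expression in as.integer() if numeric values are needed.

f4 <- function(data, cols) transform(data, min = !rowSums(!data[cols]))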
f4(Data, c("C1", "C2", "C4"))
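
As a hedged illustration of 4), the sketch below checks the type of the new column and shows one way to get 0/1 values back. The name f4_int is hypothetical and not part of the answer above; it only adds an as.integer() call.

```r
# Rebuild the question's example data so the snippet is self-contained.
Data <- data.frame(
  C1 = c(1, 0, 1, 1, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0),
  C2 = c(0, 1, 1, 1, 0, 1, 0, 0, 0, 1, 1, 1, 0, 1, 0),
  C3 = c(0, 1, 0, 0, 0, 0, 1, 1, 1, 0, 1, 1, 0, 0, 1),
  C4 = c(0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0)
)
cols <- c("C1", "C2", "C4")

# The version from the answer: the new column is logical.
f4 <- function(data, cols) transform(data, min = !rowSums(!data[cols]))
res <- f4(Data, cols)
class(res$min)   # "logical"

# Hypothetical variant: as.integer() maps TRUE/FALSE to 1/0.
f4_int <- function(data, cols) {
  transform(data, min = as.integer(!rowSums(!data[cols])))
}
res_int <- f4_int(Data, cols)

# The integer column agrees with the pmin-based result.
all(res_int$min == pmin(Data$C1, Data$C2, Data$C4))   # TRUE
```

This shortcut relies on the columns holding only 0s and 1s; for general numeric data the pmin-based versions are the safer choice.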

CodePudding user response:

We can do this concisely with functions from the tidyverse packages. The key is to pass the column names unquoted, wrapped in vars(), and then use the triple bang !!! inside the function to splice them into the pmin call as separate arguments.

library(tidyverse)

min.col <- function(data, conditions){
  data %>%
    mutate(Min = pmin(!!!conditions))
}


min.col(Data, vars(C1, C2))
#>    C1 C2 C3 C4 Min
#> 1   1  0  0  0   0
#> 2   0  1  1  0   0
#> 3   1  1  0  0   1
#> 4   1  1  0  1   1
#> 5   0  0  0  0   0
#> 6   0  1  0  0   0
#> 7   1  0  1  0   0
#> 8   1  0  1  1   0
#> 9   0  0  1  0   0
#> 10  0  1  0  0   0
#> 11  0  1  1  0   0
#> 12  1  1  1  1   1
#> 13  1  0  0  0   0
#> 14  0  1  0  0   0
#> 15  0  0  1  0   0
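
If the column names have to arrive as a character vector, as in the question's call, one possible variant is sketched below. It assumes the dplyr and rlang packages are installed; the function name min_col_chr is hypothetical. rlang::syms() converts the strings to symbols, which !!! can then splice in the same way as the vars() version.

```r
library(dplyr)

# Rebuild the question's example data so the snippet is self-contained.
Data <- data.frame(
  C1 = c(1, 0, 1, 1, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0),
  C2 = c(0, 1, 1, 1, 0, 1, 0, 0, 0, 1, 1, 1, 0, 1, 0),
  C3 = c(0, 1, 0, 0, 0, 0, 1, 1, 1, 0, 1, 1, 0, 0, 1),
  C4 = c(0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0)
)

# Hypothetical helper: `conditions` is a character vector of column names.
min_col_chr <- function(data, conditions) {
  # syms() turns c("C1", "C2", "C4") into a list of symbols;
  # !!! splices them into pmin() as separate arguments.
  data %>%
    mutate(Min = pmin(!!!rlang::syms(conditions)))
}

out <- min_col_chr(Data, c("C1", "C2", "C4"))
all(out$Min == pmin(Data$C1, Data$C2, Data$C4))   # TRUE
```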