Within a function, I am trying to add a column to a data frame that holds the row-wise minimum of several other columns, whose names are passed as an argument to the function.
A minimal data set would be:
C1 <- c(1, 0, 1, 1, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0)
C2 <- c(0, 1, 1, 1, 0, 1, 0, 0, 0, 1, 1, 1, 0, 1, 0)
C3 <- c(0, 1, 0, 0, 0, 0, 1, 1, 1, 0, 1, 1, 0, 0, 1)
C4 <- c(0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0)
Data <- data.frame(C1, C2, C3, C4)
If I want the minimum from C1, C2, and C4, outside a function, I would call:
Data$Min <- pmin(Data$C1, Data$C2, Data$C4)
Inside a function, however, I struggle and was only able to produce this:
min.col <- function(data, conditions){
  data$Min <- pmin(data[[conditions]]) # [[ ]] is the wrong way to refer to the conditions, but I cannot find the right one
  # The function then goes on to use data$Min, but that part is not relevant to the present problem.
}
To be called by:
min.col(Data, conditions=c("C1", "C2", "C4"))
Anyone there to help? Many thanks in advance!
CodePudding user response:
These use only base R.
1) We can use do.call("pmin", ...) like this.
f <- function(data, cols) transform(data, min = do.call("pmin", data[cols]))
f(Data, c("C1", "C2", "C4"))
giving:
C1 C2 C3 C4 min
1 1 0 0 0 0
2 0 1 1 0 0
3 1 1 0 0 0
4 1 1 0 1 1
5 0 0 0 0 0
6 0 1 0 0 0
7 1 0 1 0 0
8 1 0 1 1 0
9 0 0 1 0 0
10 0 1 0 0 0
11 0 1 1 0 0
12 1 1 1 1 1
13 1 0 0 0 0
14 0 1 0 0 0
15 0 0 1 0 0
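To see why do.call fits here: data[cols] is a list of columns, and do.call passes each element of that list to pmin as a separate argument, which is what the explicit pmin(Data$C1, Data$C2, Data$C4) call in the question does by hand. Below is a minimal sketch that checks the equivalence. The name min_col and the na.rm pass-through are illustrative additions, not part of the answer above.

```r
# Sample data from the question
Data <- data.frame(
  C1 = c(1, 0, 1, 1, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0),
  C2 = c(0, 1, 1, 1, 0, 1, 0, 0, 0, 1, 1, 1, 0, 1, 0),
  C3 = c(0, 1, 0, 0, 0, 0, 1, 1, 1, 0, 1, 1, 0, 0, 1),
  C4 = c(0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0)
)

# min_col is a hypothetical name; na.rm is an assumed convenience argument
min_col <- function(data, cols, na.rm = FALSE) {
  # Turn the selected columns into an unnamed list of vectors, then
  # append na.rm so it reaches pmin as a named argument
  args <- c(unname(as.list(data[cols])), list(na.rm = na.rm))
  data$Min <- do.call(pmin, args)
  data
}

res <- min_col(Data, c("C1", "C2", "C4"))

# The explicit call from the question, for comparison
explicit <- pmin(Data$C1, Data$C2, Data$C4)
same_result <- isTRUE(all.equal(res$Min, explicit))
```

Assigning with data$Min <- ... instead of transform() is a design choice in this sketch: transform() evaluates its arguments inside the data frame first, so a column that happened to be named cols or data would shadow the function's own arguments.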
2) or use apply
f2 <- function(data, cols) transform(data, min = apply(data[cols], 1, min))
f2(Data, c("C1", "C2", "C4"))
3) or Reduce
f3 <- function(data, cols) transform(data, min = Reduce(pmin, data[cols]))
f3(Data, c("C1", "C2", "C4"))
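As a rough check of how Reduce works here: it folds the list of columns pairwise, so with three columns it computes pmin(pmin(C1, C2), C4). The sketch below spells that out and compares the three approaches so far. The helper same() and the variable names are made up for illustration.

```r
# Sample data from the question
Data <- data.frame(
  C1 = c(1, 0, 1, 1, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0),
  C2 = c(0, 1, 1, 1, 0, 1, 0, 0, 0, 1, 1, 1, 0, 1, 0),
  C3 = c(0, 1, 0, 0, 0, 0, 1, 1, 1, 0, 1, 1, 0, 0, 1),
  C4 = c(0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0)
)
cols <- c("C1", "C2", "C4")

# Reduce applies pmin to the first two columns, then to that result and the third
by_reduce <- Reduce(pmin, Data[cols])
by_hand   <- pmin(pmin(Data$C1, Data$C2), Data$C4)

# The other two approaches, for comparison
by_docall <- do.call(pmin, Data[cols])
by_apply  <- apply(Data[cols], 1, min)  # converts the columns to a matrix first

# Hypothetical helper: compare values only, ignoring names and other attributes
same <- function(a, b) isTRUE(all.equal(as.numeric(a), as.numeric(b)))

all_agree <- same(by_reduce, by_hand) &&
  same(by_reduce, by_docall) &&
  same(by_reduce, by_apply)
```

Note that apply calls min once per row on a matrix copy of the data, while do.call and Reduce stay vectorized over whole columns, so the latter two are likely the better fit for large data frames.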
4) If data[cols] contains only 0s and 1s, the row minimum is 1 exactly when the row has no 0s, and 0 otherwise. Since 0 is coerced to FALSE and any other number to TRUE, !data[cols] marks the zeros, rowSums counts them per row, and the outer ! is TRUE only where that count is 0. Note that the resulting column is logical (TRUE/FALSE) rather than 0/1:
f4 <- function(data, cols) transform(data, min = !rowSums(!data[cols]))
f4(Data, c("C1", "C2", "C4"))
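The steps of the rowSums approach can be unpacked as follows. This is a sketch with illustrative variable names, and the final as.integer conversion is an assumed extra step for anyone who wants a 0/1 column instead of a logical one.

```r
# Sample data from the question
Data <- data.frame(
  C1 = c(1, 0, 1, 1, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0),
  C2 = c(0, 1, 1, 1, 0, 1, 0, 0, 0, 1, 1, 1, 0, 1, 0),
  C3 = c(0, 1, 0, 0, 0, 0, 1, 1, 1, 0, 1, 1, 0, 0, 1),
  C4 = c(0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0)
)
cols <- c("C1", "C2", "C4")

# !Data[cols] is TRUE wherever a cell is 0, so rowSums counts the zeros per row
n_zero <- rowSums(!Data[cols])

# A row with no zeros has minimum 1; negating the count gives a logical vector
min_lgl <- !n_zero

# Convert to 0/1 if a numeric column is wanted (this also drops the row names)
min_int <- as.integer(min_lgl)

# Compare against the explicit pmin call from the question
matches_pmin <- all(min_int == pmin(Data$C1, Data$C2, Data$C4))
```

This shortcut relies on the columns holding only 0s and 1s. For general numeric data, one of the pmin-based versions is the safer choice.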
CodePudding user response:
We can do this pretty quickly using some of the functions from the tidyverse packages. The key here is not to quote the column names but to wrap them in vars(), and then use the triple bang !!! to splice them into the pmin call inside the function. The call below uses only C1 and C2.
library(tidyverse)
min.col <- function(data, conditions){
  data %>%
    mutate(Min = pmin(!!!conditions))
}
min.col(Data, vars(C1, C2))
#> C1 C2 C3 C4 Min
#> 1 1 0 0 0 0
#> 2 0 1 1 0 0
#> 3 1 1 0 0 1
#> 4 1 1 0 1 1
#> 5 0 0 0 0 0
#> 6 0 1 0 0 0
#> 7 1 0 1 0 0
#> 8 1 0 1 1 0
#> 9 0 0 1 0 0
#> 10 0 1 0 0 0
#> 11 0 1 1 0 0
#> 12 1 1 1 1 1
#> 13 1 0 0 0 0
#> 14 0 1 0 0 0
#> 15 0 0 1 0 0