How to find all possible combinations of a given set of variables

Time:11-05

I have a dataset with 6 variables:

Var1 <- c(1,0,1,0,1)
Var2 <- c(1,0,1,0,1)
Var3 <- c(1,1,1,0,1)
Var4 <- c(1,0,1,1,1)
Var5 <- c(1,0,0,0,1)
Var6 <- c(1,0,1,0,1)

DF <- data.frame(Var1, Var2, Var3, Var4, Var5, Var6)
DF

which results in

    Var1 Var2 Var3 Var4 Var5 Var6
1    1    1    1    1    1    1
2    0    0    1    0    0    0
3    1    1    1    1    0    1
4    0    0    0    1    0    0
5    1    1    1    1    1    1

I want to find all possible combinations of the variables: how many 2-variable combinations (e.g. Var1Var2, Var2Var4, Var5Var4, etc.), 3-variable combinations, 4-variable combinations, and so on do I have? Is there a way to calculate this?

Thanks.

CodePudding user response:

To count the combinations of each size (2 variables up to all 6), use choose():

> choose(length(DF), 2:length(DF))
[1] 15 20 15  6  1
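
The counts are easier to read once each one is labeled with its combination size. The following is a minimal sketch, assuming DF is the six-column data frame from the question; the names n_vars, sizes, and counts are illustrative and not part of the original answer.

```r
# Rebuild the data frame from the question so the snippet runs on its own.
DF <- data.frame(
  Var1 = c(1, 0, 1, 0, 1),
  Var2 = c(1, 0, 1, 0, 1),
  Var3 = c(1, 1, 1, 0, 1),
  Var4 = c(1, 0, 1, 1, 1),
  Var5 = c(1, 0, 0, 0, 1),
  Var6 = c(1, 0, 1, 0, 1)
)

n_vars <- length(DF)   # number of columns in DF
sizes  <- 2:n_vars     # combination sizes to count

# choose(n, k) is vectorized over k, so one call covers every size.
counts <- setNames(choose(n_vars, sizes), paste0("size_", sizes))
counts

# Total number of combinations that use two or more variables.
sum(counts)
```

The total should come to 57, the sum of the five counts above.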

Or, to list the combinations themselves, use combn():

> lapply(
    2:length(DF),
    combn,
    x = names(DF)
  )
[[1]]
     [,1]   [,2]   [,3]   [,4]   [,5]   [,6]   [,7]   [,8]   [,9]   [,10]
[1,] "Var1" "Var1" "Var1" "Var1" "Var1" "Var2" "Var2" "Var2" "Var2" "Var3"
[2,] "Var2" "Var3" "Var4" "Var5" "Var6" "Var3" "Var4" "Var5" "Var6" "Var4"
     [,11]  [,12]  [,13]  [,14]  [,15]
[1,] "Var3" "Var3" "Var4" "Var4" "Var5"
[2,] "Var5" "Var6" "Var5" "Var6" "Var6"

[[2]]
     [,1]   [,2]   [,3]   [,4]   [,5]   [,6]   [,7]   [,8]   [,9]   [,10]
[1,] "Var1" "Var1" "Var1" "Var1" "Var1" "Var1" "Var1" "Var1" "Var1" "Var1"
[2,] "Var2" "Var2" "Var2" "Var2" "Var3" "Var3" "Var3" "Var4" "Var4" "Var5"
[3,] "Var3" "Var4" "Var5" "Var6" "Var4" "Var5" "Var6" "Var5" "Var6" "Var6"
     [,11]  [,12]  [,13]  [,14]  [,15]  [,16]  [,17]  [,18]  [,19]  [,20]
[1,] "Var2" "Var2" "Var2" "Var2" "Var2" "Var2" "Var3" "Var3" "Var3" "Var4"
[2,] "Var3" "Var3" "Var3" "Var4" "Var4" "Var5" "Var4" "Var4" "Var5" "Var5"
[3,] "Var4" "Var5" "Var6" "Var5" "Var6" "Var6" "Var5" "Var6" "Var6" "Var6"

[[3]]
     [,1]   [,2]   [,3]   [,4]   [,5]   [,6]   [,7]   [,8]   [,9]   [,10]
[1,] "Var1" "Var1" "Var1" "Var1" "Var1" "Var1" "Var1" "Var1" "Var1" "Var1"
[2,] "Var2" "Var2" "Var2" "Var2" "Var2" "Var2" "Var3" "Var3" "Var3" "Var4"
[3,] "Var3" "Var3" "Var3" "Var4" "Var4" "Var5" "Var4" "Var4" "Var5" "Var5"
[4,] "Var4" "Var5" "Var6" "Var5" "Var6" "Var6" "Var5" "Var6" "Var6" "Var6"
     [,11]  [,12]  [,13]  [,14]  [,15]
[1,] "Var2" "Var2" "Var2" "Var2" "Var3"
[2,] "Var3" "Var3" "Var3" "Var4" "Var4"
[3,] "Var4" "Var4" "Var5" "Var5" "Var5"
[4,] "Var5" "Var6" "Var6" "Var6" "Var6"

[[4]]
     [,1]   [,2]   [,3]   [,4]   [,5]   [,6]
[1,] "Var1" "Var1" "Var1" "Var1" "Var1" "Var2"
[2,] "Var2" "Var2" "Var2" "Var2" "Var3" "Var3"
[3,] "Var3" "Var3" "Var3" "Var4" "Var4" "Var4"
[4,] "Var4" "Var4" "Var5" "Var5" "Var5" "Var5"
[5,] "Var5" "Var6" "Var6" "Var6" "Var6" "Var6"

[[5]]
     [,1]
[1,] "Var1"
[2,] "Var2"
[3,] "Var3"
[4,] "Var4"
[5,] "Var5"
[6,] "Var6"
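
If labels in the style the question uses (Var1Var2, Var2Var4, ...) are more convenient than matrices, combn() can collapse each combination into a single string through its FUN argument. This is a sketch under the assumption that the variable names match names(DF) from the question; vars and labels are illustrative names.

```r
# Same names as names(DF) in the question, built directly so the snippet is self-contained.
vars <- paste0("Var", 1:6)

# For each size m, paste the names in every combination into one label.
# Extra arguments to combn() (here, collapse) are passed on to FUN.
labels <- lapply(2:length(vars), function(m) {
  combn(vars, m, FUN = paste, collapse = "")
})
names(labels) <- paste0("size_", 2:length(vars))

labels$size_2     # character vector of the two-variable labels
lengths(labels)   # number of labels per size
```

The lengths should agree with the choose() counts, which is a quick consistency check between the two approaches.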

CodePudding user response:

Since all variables in your case are binary, the number of possible value combinations for k variables is just 2^k.

To count the combinations for non-binary variables as well, you can use the function expand.grid and then count the number of rows. Since you probably don't want to count the same combination twice, count only the unique rows. Here is a simple example:

> library(dplyr)
> var1 <- c(1,2,2,3,5)
> var2 <- c(1,1,1,2,3)
> expand.grid(var1, var2) %>% unique %>% nrow
[1] 12
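
The same count can also be obtained without building the full grid, because the number of unique rows from expand.grid equals the product of the number of distinct values in each variable. This is a minimal base-R sketch of that shortcut, not the method given in the answer; n_distinct_values is an illustrative name.

```r
var1 <- c(1, 2, 2, 3, 5)
var2 <- c(1, 1, 1, 2, 3)

# Count the distinct values in each variable.
n_distinct_values <- sapply(
  list(var1 = var1, var2 = var2),
  function(x) length(unique(x))
)
n_distinct_values

# The product of those counts is the number of possible value combinations.
prod(n_distinct_values)

# Cross-check against the grid-based count (base R only, no dplyr needed).
nrow(unique(expand.grid(var1, var2)))
```

For many variables or long vectors this avoids materializing every row of the grid, which grows multiplicatively with each added variable.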