Combination of three variables in R

Time:09-04

I have two groups:

group1 <- c("a", "b")
group2 <- c("c", "d", "e")

I want to get all combinations that take exactly one element from group1 (Min(group1) = 1; Max(group1) = 1) and one or two elements from group2 (Min(group2) = 1; Max(group2) = 2).

So the result should be these combinations:

- "a" "c"
- "b" "c"
- "a" "d"
- "b" "d"
- "a" "e"
- "b" "e"
- "a" "c" "d"
- "b" "c" "d"
- "a" "c" "e"
- "b" "c" "e"
- "a" "d" "e"
- "b" "d" "e"

Thank you guys

CodePudding user response:

You can get the combinations of the second group (all single elements plus all pairs) like this:

g <- c(as.list(group2), combn(group2, 2, simplify = FALSE))

Then use lapply() over these, each time calling c() to prepend each element of group1 (the \(i) shorthand needs R 4.1 or later):

unlist(lapply(g, \(i) lapply(group1, c, i)), recursive = FALSE)

Output:

[[1]]
[1] "a" "c"

[[2]]
[1] "b" "c"

[[3]]
[1] "a" "d"

[[4]]
[1] "b" "d"

[[5]]
[1] "a" "e"

[[6]]
[1] "b" "e"

[[7]]
[1] "a" "c" "d"

[[8]]
[1] "b" "c" "d"

[[9]]
[1] "a" "c" "e"

[[10]]
[1] "b" "c" "e"

[[11]]
[1] "a" "d" "e"

[[12]]
[1] "b" "d" "e"
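
If the Min/Max for group2 might change, the same idea can probably be written with the subset sizes as a vector instead of hard-coding singles and pairs. This is only a sketch: `sizes` and `res` are names I made up for illustration, and it assumes group2 is a character vector (combn() treats a single positive number as seq_len() of that number).

```r
group1 <- c("a", "b")
group2 <- c("c", "d", "e")

# Hypothetical range of subset sizes to draw from group2 (Min = 1, Max = 2)
sizes <- 1:2

# One list entry per subset of group2, for every size in `sizes`
g <- unlist(lapply(sizes, function(k) combn(group2, k, simplify = FALSE)),
            recursive = FALSE)

# Prepend each element of group1 to each subset of group2
res <- unlist(lapply(g, function(i) lapply(group1, c, i)), recursive = FALSE)
length(res)
#> [1] 12
```

Changing `sizes` to `1:3` would also include the full triple "c" "d" "e".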

CodePudding user response:

Two calls of expand.grid will do this:

eg1 <- expand.grid(group1, group2)
eg1
#   Var1 Var2
# 1    a    c
# 2    b    c
# 3    a    d
# 4    b    d
# 5    a    e
# 6    b    e

and

eg2 <- expand.grid(group1, seq_len(ncol(combn(group2, 2))))
eg2 <- cbind(eg2, t(combn(group2, 2)[, eg2$Var2]))[,-2]
eg2
#   Var1 1 2
# 1    a c d
# 2    b c d
# 3    a c e
# 4    b c e
# 5    a d e
# 6    b d e

And then they can be combined with c. I'll unname them here for presentation purposes, though the call to unname is purely cosmetic.

out <- c(asplit(eg1, 1), asplit(eg2, 1))
out <- lapply(out, function(z) unname(c(z)))
str(out)
# List of 12
#  $ : chr [1:2] "a" "c"
#  $ : chr [1:2] "b" "c"
#  $ : chr [1:2] "a" "d"
#  $ : chr [1:2] "b" "d"
#  $ : chr [1:2] "a" "e"
#  $ : chr [1:2] "b" "e"
#  $ : chr [1:3] "a" "c" "d"
#  $ : chr [1:3] "b" "c" "d"
#  $ : chr [1:3] "a" "c" "e"
#  $ : chr [1:3] "b" "c" "e"
#  $ : chr [1:3] "a" "d" "e"
#  $ : chr [1:3] "b" "d" "e"
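
One thing to be aware of with this approach: expand.grid() converts character vectors to factors by default, and it is asplit() coercing the data frame to a matrix that turns them back into strings. If you want to work with the data frame columns directly, a small sketch (the names `eg_fct`, `eg_chr` and `rows` are just for illustration) would be:

```r
group1 <- c("a", "b")
group2 <- c("c", "d", "e")

# Default behaviour: character inputs become factor columns
eg_fct <- expand.grid(group1, group2)
class(eg_fct$Var1)
#> [1] "factor"

# Keep the columns as plain character vectors instead
eg_chr <- expand.grid(group1, group2, stringsAsFactors = FALSE)
class(eg_chr$Var1)
#> [1] "character"

# Splitting by row gives the same character vectors either way
rows <- lapply(asplit(eg_chr, 1), function(z) unname(c(z)))
```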

CodePudding user response:

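A more general solution that takes any two input vectors and the minimum / maximum number of elements taken from each would be:

comb <- function(a, b, min_a = 1, max_a = 1, min_b = 1, max_b = 2) {
  # All subsets of a (and of b) with sizes in the requested range, as lists of vectors
  as <- do.call(c, lapply(seq(min_a, max_a), function(i) combn(a, i, simplify = FALSE)))
  bs <- do.call(c, lapply(seq(min_b, max_b), function(i) combn(b, i, simplify = FALSE)))
  # Pair every subset of a with every subset of b via their indices
  idx <- expand.grid(i = seq_along(as), j = seq_along(bs))
  # Map() always returns a list, even when every combination has the same length
  # (apply() would simplify that case to a matrix)
  Map(function(i, j) c(as[[i]], bs[[j]]), idx$i, idx$j)
}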

By default this would give:

comb(group1, group2)
#> [[1]]
#> [1] "a" "c"
#> 
#> [[2]]
#> [1] "b" "c"
#> 
#> [[3]]
#> [1] "a" "d"
#> 
#> [[4]]
#> [1] "b" "d"
#> 
#> [[5]]
#> [1] "a" "e"
#> 
#> [[6]]
#> [1] "b" "e"
#> 
#> [[7]]
#> [1] "a" "c" "d"
#> 
#> [[8]]
#> [1] "b" "c" "d"
#> 
#> [[9]]
#> [1] "a" "c" "e"
#> 
#> [[10]]
#> [1] "b" "c" "e"
#> 
#> [[11]]
#> [1] "a" "d" "e"
#> 
#> [[12]]
#> [1] "b" "d" "e"

Created on 2022-09-03 with reprex v2.0.2
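
As a quick sanity check on the 12 results above, the expected number of combinations can be worked out with choose(): sum the subset counts for each group over its size range, then multiply. A minimal sketch, where `n_comb` is a hypothetical helper name and not part of any package:

```r
# Hypothetical helper: number of combinations for given group sizes and size ranges
n_comb <- function(n_a, n_b, min_a = 1, max_a = 1, min_b = 1, max_b = 2) {
  sum(choose(n_a, seq(min_a, max_a))) * sum(choose(n_b, seq(min_b, max_b)))
}

group1 <- c("a", "b")
group2 <- c("c", "d", "e")

# choose(2, 1) * (choose(3, 1) + choose(3, 2)) = 2 * (3 + 3)
n_comb(length(group1), length(group2))
#> [1] 12
```

Comparing this number with length() of whatever list you generate is a cheap way to catch missing or duplicated combinations.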
