Combinations of varying number of elements from two vectors

I have two vectors:

group1 <- c("a", "b")
group2 <- c("c", "d", "e")

I want to get all combinations of one element from 'group1' with one or two elements from 'group2'.

The desired result is:

"a" "c" # one element from group1, one element from group 2
"b" "c"
"a" "d"
"b" "d"
"a" "e"
"b" "e"
"a" "c" "d" # one element from group1, two elements from group 2
"b" "c" "d"
"a" "c" "e"
"b" "c" "e"
"a" "d" "e"
"b" "d" "e"

CodePudding user response:

You can get the combinations of the second group like this:

g <- c(as.list(group2), combn(group2, 2, simplify = FALSE))

and then use lapply() over these, each time using c() with the elements of group1:

unlist(lapply(g, \(i) lapply(group1, c, i)), recursive = FALSE)

Output:

[[1]]
[1] "a" "c"

[[2]]
[1] "b" "c"

[[3]]
[1] "a" "d"

[[4]]
[1] "b" "d"

[[5]]
[1] "a" "e"

[[6]]
[1] "b" "e"

[[7]]
[1] "a" "c" "d"

[[8]]
[1] "b" "c" "d"

[[9]]
[1] "a" "c" "e"

[[10]]
[1] "b" "c" "e"

[[11]]
[1] "a" "d" "e"

[[12]]
[1] "b" "d" "e"
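As a quick sanity check on this approach (a sketch, not part of the original answer), the number of results should be the size of group1 times the number of one- and two-element subsets of group2, that is 2 * (choose(3, 1) + choose(3, 2)) = 12. The names res and expected are only illustrative.

```r
group1 <- c("a", "b")
group2 <- c("c", "d", "e")

# all 1- and 2-element subsets of group2, as a list of character vectors
g <- c(as.list(group2), combn(group2, 2, simplify = FALSE))

# prepend each element of group1 to each subset
res <- unlist(lapply(g, \(i) lapply(group1, c, i)), recursive = FALSE)

# expected count: |group1| * (C(3, 1) + C(3, 2))
expected <- length(group1) * (choose(length(group2), 1) + choose(length(group2), 2))
length(res) == expected
```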

CodePudding user response:

A more general solution, which accepts any two input vectors and the minimum and maximum number of elements to take from each, would be:

comb <- function(a, b, min_a = 1, max_a = 1, min_b = 1, max_b = 2) {
  as <- do.call(c, lapply(min_a:max_a, \(i) combn(a, i, simplify = FALSE)))
  bs <- do.call(c, lapply(min_b:max_b, \(i) combn(b, i, simplify = FALSE)))
  apply(expand.grid(as, bs), 1, unlist, use.names = FALSE)
}

By default this would give:

comb(group1, group2)
#> [[1]]
#> [1] "a" "c"
#> 
#> [[2]]
#> [1] "b" "c"
#> 
#> [[3]]
#> [1] "a" "d"
#> 
#> [[4]]
#> [1] "b" "d"
#> 
#> [[5]]
#> [1] "a" "e"
#> 
#> [[6]]
#> [1] "b" "e"
#> 
#> [[7]]
#> [1] "a" "c" "d"
#> 
#> [[8]]
#> [1] "b" "c" "d"
#> 
#> [[9]]
#> [1] "a" "c" "e"
#> 
#> [[10]]
#> [1] "b" "c" "e"
#> 
#> [[11]]
#> [1] "a" "d" "e"
#> 
#> [[12]]
#> [1] "b" "d" "e"

Created on 2022-09-03 with reprex v2.0.2
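One thing to keep in mind with the function above: apply() simplifies its result to a matrix when every call returns a vector of the same length, so settings in which all combinations have the same size (for example max_b = 1) may not give a list. Below is a minimal sketch of a variant that always returns a list. The names comb_list, pick, a_sets and b_sets are hypothetical and not from the original answer.

```r
comb_list <- function(a, b, min_a = 1, max_a = 1, min_b = 1, max_b = 2) {
  # all subsets of x with sizes lo..hi, as a flat list of vectors
  pick <- function(x, lo, hi) {
    do.call(c, lapply(lo:hi, function(k) combn(x, k, simplify = FALSE)))
  }
  a_sets <- pick(a, min_a, max_a)
  b_sets <- pick(b, min_b, max_b)

  # pair every subset of a with every subset of b by index
  idx <- expand.grid(i = seq_along(a_sets), j = seq_along(b_sets))

  # Map() always returns a list, whatever the lengths of the pieces
  Map(function(i, j) c(a_sets[[i]], b_sets[[j]]), idx$i, idx$j)
}

group1 <- c("a", "b")
group2 <- c("c", "d", "e")
res <- comb_list(group1, group2)
```

With the defaults, res holds the same 12 vectors in the same order as above; with max_b = 1 it is still a list, now of six length-2 vectors.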

CodePudding user response:

Two calls of expand.grid will do this:

eg1 <- expand.grid(group1, group2)
eg1
#   Var1 Var2
# 1    a    c
# 2    b    c
# 3    a    d
# 4    b    d
# 5    a    e
# 6    b    e

and

eg2 <- expand.grid(group1, seq_len(ncol(combn(group2, 2))))
eg2 <- cbind(eg2, t(combn(group2, 2)[, eg2$Var2]))[,-2]
eg2
#   Var1 1 2
# 1    a c d
# 2    b c d
# 3    a c e
# 4    b c e
# 5    a d e
# 6    b d e

The two results can then be combined with c(). I unname them here for presentation only; the call to unname() is purely cosmetic.

out <- c(asplit(eg1, 1), asplit(eg2, 1))
out <- lapply(out, function(z) unname(c(z)))
str(out)
# List of 12
#  $ : chr [1:2] "a" "c"
#  $ : chr [1:2] "b" "c"
#  $ : chr [1:2] "a" "d"
#  $ : chr [1:2] "b" "d"
#  $ : chr [1:2] "a" "e"
#  $ : chr [1:2] "b" "e"
#  $ : chr [1:3] "a" "c" "d"
#  $ : chr [1:3] "b" "c" "d"
#  $ : chr [1:3] "a" "c" "e"
#  $ : chr [1:3] "b" "c" "e"
#  $ : chr [1:3] "a" "d" "e"
#  $ : chr [1:3] "b" "d" "e"

CodePudding user response:

This solution uses only base R and only one invocation of combn. It works by taking all combinations of 3 elements of c(group1, group2, "") and keeping only the ones that have one element from group1 and 1 or 2 from group2. The "" elements are removed leaving the desired list of vectors. We also show how to optionally represent this as a matrix.

Define the function ok to return TRUE if the number of elements of its first argument that occur in h is one of the values in ix, and FALSE otherwise. Then define g to return its argument without any "" components if it contains one element from group1 and 1 or 2 from group2; otherwise it returns NULL. Apply g to all combinations of 3 elements of c(group1, group2, ""), then drop the zero-length (NULL) components to get the list L. Use that if you want a list of character vectors as the result.

If, rather than a list result, a matrix result is wanted then use the last line as well.

ok <- function(g, h, ix) sum(g %in% h) %in% ix
g <- function(x) if (ok(x, group1, 1) && ok(x, group2, 1:2)) x[nchar(x) > 0]

L <- Filter(length, combn(c(group1, group2, ""), 3, g, simplify = FALSE))

# next line is only if you want a matrix result
do.call("rbind", lapply(L, `[`, 1:3))

giving this matrix:

      [,1] [,2] [,3]
 [1,] "a"  "c"  "d" 
 [2,] "a"  "c"  "e" 
 [3,] "a"  "c"  NA  
 [4,] "a"  "d"  "e" 
 [5,] "a"  "d"  NA  
 [6,] "a"  "e"  NA  
 [7,] "b"  "c"  "d" 
 [8,] "b"  "c"  "e" 
 [9,] "b"  "c"  NA  
[10,] "b"  "d"  "e" 
[11,] "b"  "d"  NA  
[12,] "b"  "e"  NA

This could also be expressed in terms of pipes like this. g is from above.

c(group1, group2, "") |>
  combn(3, g, simplify = FALSE) |>
  Filter(f = length) |>
  lapply(`[`, 1:3) |> # this & next line if matrix wanted
  do.call(what = "rbind")
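To check that the padding approach yields the same set of combinations as building them directly, here is a small sketch that compares the two results while ignoring order. The helper key and the reference list ref are illustrative names, not part of the original answer.

```r
group1 <- c("a", "b")
group2 <- c("c", "d", "e")

ok <- function(g, h, ix) sum(g %in% h) %in% ix
g <- function(x) if (ok(x, group1, 1) && ok(x, group2, 1:2)) x[nchar(x) > 0]

# padding approach: 3-element combinations of the pooled vector plus one ""
L <- Filter(length, combn(c(group1, group2, ""), 3, g, simplify = FALSE))

# reference: prepend each group1 element to every 1- or 2-element subset of group2
subsets <- c(as.list(group2), combn(group2, 2, simplify = FALSE))
ref <- unlist(lapply(subsets, function(s) lapply(group1, c, s)), recursive = FALSE)

# collapse each vector to one string and sort, so that order does not matter
key <- function(lst) sort(vapply(lst, paste, character(1), collapse = " "))
same <- identical(key(L), key(ref))
```

Here same should be TRUE: both lists contain the same 12 vectors, only in a different order.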

CodePudding user response:

We can use

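# one element from group1 with one element from group2
a <- expand.grid(group1, group2, stringsAsFactors = FALSE)

# one element from group1 with two distinct elements from group2
b <- expand.grid(group1, group2[-3], group2[-1],
       stringsAsFactors = FALSE) |> subset(Var2 != Var3)

# turn the rows of a data frame into a list of character vectors
rows <- \(d) d |> unname() |> apply(1, list) |> unlist(FALSE)
c(rows(a), rows(b))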

Output

[[1]]
[1] "a" "c"

[[2]]
[1] "b" "c"

[[3]]
[1] "a" "d"

[[4]]
[1] "b" "d"

[[5]]
[1] "a" "e"

[[6]]
[1] "b" "e"

$`1`
[1] "a" "c" "d"

$`2`
[1] "b" "c" "d"

$`5`
[1] "a" "c" "e"

$`6`
[1] "b" "c" "e"

$`7`
[1] "a" "d" "e"

$`8`
[1] "b" "d" "e"