Home > OS >  Get all possible partitions of a set using Bell's number in R
Get all possible partitions of a set using Bell's number in R

Time:05-17

I want to find all the possible ways a number of categories can be similar to each other in R. The total number of combinations is calculated as Bell's number. E.g. there are 15 ways to organise 4 categories.

I am trying to write some R code that shows those 15 ways as binary vectors.

This first block of code shows there are four classes, each of which can take a the value of a, b, or c.

set.seed(123)
n_classes = 4
classes = LETTERS[1:n_classes]
state = sample(letters[1:3], size = n_classes, replace = TRUE)
# "c" "b" "b" "b"

Comparing each class to each other we can get a structural representation of this system:

comparisons = outer(state, state, "==") %>% 
  .[lower.tri(.)]
names(comparisons) = outer(classes, classes, paste0) %>% 
  .[lower.tri(.)]
# > comparisons
# BA    CA    DA    CB    DB    DC 
# TRUE  TRUE FALSE  TRUE FALSE FALSE 

You can see B and A are the same, and C and A are the same, so by deduction C and B are the same and all other comparisons are different.

We know this is one of 15 solutions for four categories - I want to know the other 14 in this binary format.

I first tried using expand.grid, but this doesnt account for the conditional structure of the categories (it just shows all the possibilities for each cell being 1 or 0)

# not correct
poss = rep(list(0:1), length(comparisons))
possibilities = expand.grid(poss)

I feel like there is a simple solution, but I am currently blind to it.

Thanks in advance.

CodePudding user response:

You can use the R package partitions to get all possible partitions of a set of categories.

library(partitions)

# the set to be partitioned
categories <- letters[1:4]

# the clusters defining the partition
clusters <- LETTERS[1:4]

categories |> length() |> listParts()
#> [[1]]
#> [1] (1,2,3,4)
#> 
#> [[2]]
#> [1] (1,2,4)(3)
#> 
#> [[3]]
#> [1] (1,2,3)(4)
#> 
#> [[4]]
#> [1] (1,3,4)(2)
#> 
#> [[5]]
#> [1] (2,3,4)(1)
#> 
#> [[6]]
#> [1] (1,4)(2,3)
#> 
#> [[7]]
#> [1] (1,2)(3,4)
#> 
#> [[8]]
#> [1] (1,3)(2,4)
#> 
#> [[9]]
#> [1] (1,4)(2)(3)
#> 
#> [[10]]
#> [1] (1,2)(3)(4)
#> 
#> [[11]]
#> [1] (1,3)(2)(4)
#> 
#> [[12]]
#> [1] (2,4)(1)(3)
#> 
#> [[13]]
#> [1] (2,3)(1)(4)
#> 
#> [[14]]
#> [1] (3,4)(1)(2)
#> 
#> [[15]]
#> [1] (1)(2)(3)(4)
categories |> length() |> setparts()
#>                                   
#> [1,] 1 1 1 1 2 1 1 1 1 1 1 2 2 2 1
#> [2,] 1 1 1 2 1 2 1 2 2 1 2 1 1 3 2
#> [3,] 1 2 1 1 1 2 2 1 3 2 1 3 1 1 3
#> [4,] 1 1 2 1 1 1 2 2 1 3 3 1 3 1 4
categories |>
  length() |>
  setparts() |>
  as.matrix() |>
  as.data.frame() |>
  lapply(function(x) sapply(x, function(y) clusters[[y]])) |>
  as.data.frame()
#>   V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 V11 V12 V13 V14 V15
#> 1  A  A  A  A  B  A  A  A  A   A   A   B   B   B   A
#> 2  A  A  A  B  A  B  A  B  B   A   B   A   A   C   B
#> 3  A  B  A  A  A  B  B  A  C   B   A   C   A   A   C
#> 4  A  A  B  A  A  A  B  B  A   C   C   A   C   A   D

Created on 2022-05-16 by the reprex package (v2.0.0)

  • Related