Suppose I have a data frame with n rows and k columns. I would like a list containing all "n choose k + 1" unique data frames, each with k + 1 rows and k columns. How do I do this in R? I know about combn(), but it works on vectors, not matrices.
CodePudding user response:
If your data frame is called df, you could do:
apply(combn(nrow(df), ncol(df) + 1), 2, function(i) df[i,])
For example:
df <- data.frame(x = 1:4, y = c('A', 'B', 'C', 'D'))
apply(combn(nrow(df), ncol(df) + 1), 2, function(i) df[i,])
# [[1]]
#   x y
# 1 1 A
# 2 2 B
# 3 3 C
#
# [[2]]
#   x y
# 1 1 A
# 2 2 B
# 4 4 D
#
# [[3]]
#   x y
# 1 1 A
# 3 3 C
# 4 4 D
#
# [[4]]
#   x y
# 2 2 B
# 3 3 C
# 4 4 D
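As a possible variation on the same idea, combn() can apply a function to each combination directly through its FUN argument, and simplify = FALSE makes it return a list. The helper name row_subsets below is hypothetical, and drop = FALSE is added as a precaution so that a single-column data frame is not collapsed to a vector. This is a sketch, not a replacement for the answer above:

```r
# Hypothetical helper: all m-row subsets of a data frame, returned as a list.
row_subsets <- function(df, m = ncol(df) + 1) {
  combn(
    nrow(df), m,
    FUN = function(i) df[i, , drop = FALSE],  # keep data frame structure
    simplify = FALSE                          # return a list, not an array
  )
}

df <- data.frame(x = 1:4, y = c('A', 'B', 'C', 'D'))
subsets <- row_subsets(df)
length(subsets)  # one element per combination of rows
```

The result has the same elements, in the same order, as the apply() version, since both walk the combinations in the order combn() generates them.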
Just be aware that you will very quickly run into memory problems if your data frame has more than a few columns and even a modest number of rows, because the list has choose(n, k + 1) elements. For example, a data frame with just 4 columns and 50 rows will generate a list of over two million data frames, and that rises to over 75 million with 100 rows. This is not a problem with the algorithm; it is just how many unique combinations there are.
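To check how large the list would be before building it, the count can be computed with choose(). The function name n_subsets is made up for this illustration:

```r
# Number of (k + 1)-row subsets of a data frame with n rows and k columns.
n_subsets <- function(n, k) choose(n, k + 1)

n_subsets(50, 4)   # 2118760
n_subsets(100, 4)  # 75287520
```

Calling n_subsets(nrow(df), ncol(df)) first gives a rough idea of whether the full list is likely to fit in memory.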