Home > Net >  R: creating combinations of elements within a group and adding up numbers associated with combinatio
R: creating combinations of elements within a group and adding up numbers associated with combinatio

Time:12-03

I have the following dataset:

 Letter ID Number
   A    A1   1
   A    A2   2
   A    A3   3
   B    B1   1
   B    B2   2
   B    B3   3
   B    B4   4

My aim is first to create all possible combinations of IDs within the same "Letter" group. For example, for the letter A, it would be only three combinations: A1-A2,A2-A3,and A1-A3. The same IDs ordered differently don't count as a new combination, so for example A1-A2 is the same as A2-A1.

Then, within those combinations, I want to add up the numbers from the "Number" column associated with those IDs. So for the combination A1-A2, which are associated with 1 and 2 in the "Number" column, this would result in the number 1 2=3.

Finally, I want to place the ID combinations, added numbers and original Letter in a new data frame. Something like this:

Letter Combination Add.Number
   A    A1-A2        3
   A    A2-A3        5
   A    A1-A3        4
   B    B1-B2        3
   B    B2-B3        5
   B    B3-B4        7
   B    B1-B3        4
   B    B2-B4        6
   B    B1-B4        5

How can I do this in R, ideally using the package dplyr?

CodePudding user response:

library(dplyr)
  

letter <- c("A","A","A","B","B","B","B")

df <-
  data.frame(letter) %>% 
  group_by(letter) %>% 
  mutate(
    number = row_number(),
    id = paste0(letter,number)
  ) 

df %>% 
  full_join(df,by = "letter") %>% 
  filter(number.x < number.y) %>% 
  mutate(
    combination = paste0(id.x,"-",id.y),
    add_number = number.x   number.y) %>% 
  select(letter,combination,add_number)

# A tibble: 9 x 3
# Groups:   letter [2]
  letter combination add_number
  <chr>  <chr>            <int>
1 A      A1-A2                3
2 A      A1-A3                4
3 A      A2-A3                5
4 B      B1-B2                3
5 B      B1-B3                4
6 B      B1-B4                5
7 B      B2-B3                5
8 B      B2-B4                6
9 B      B3-B4                7

CodePudding user response:

In base R, using combn:

df <- data.frame(
  Letter = c("A","A","A","B","B","B","B"),
  Id = c("A1","A2","A3","B1","B2","B3","B4"),
  Number = c(1,2,3,1,2,3,4))

# combinations
l<-lapply(split(df$Id, df$Letter) ,function(x) 
  setNames(data.frame(t(combn(x,2))), c("L1","L2")))
n<-lapply(split(df$Number, df$Letter) ,function(x) 
  setNames(data.frame(t(combn(x,2))), c("N1","N2")))

# rbind all
result <- do.call(rbind, mapply(cbind, Letter=names(l), l, n, SIMPLIFY = F))
result$combination <- paste(result$L1, result$L2, sep="-")
result$sum = result$N1   result$N2
result
#>     Letter L1 L2 N1 N2 combination sum
#> A.1      A A1 A2  1  2       A1-A2   3
#> A.2      A A1 A3  1  3       A1-A3   4
#> A.3      A A2 A3  2  3       A2-A3   5
#> B.1      B B1 B2  1  2       B1-B2   3
#> B.2      B B1 B3  1  3       B1-B3   4
#> B.3      B B1 B4  1  4       B1-B4   5
#> B.4      B B2 B3  2  3       B2-B3   5
#> B.5      B B2 B4  2  4       B2-B4   6
#> B.6      B B3 B4  3  4       B3-B4   7
  • Related