Home > Blockchain >  How can I find all the possible combinations of elements in column vector and cross check them from
How can I find all the possible combinations of elements in column vector and cross check them from

Time:10-18

I have 2 data frames of a single column

data frame A:

var
A
B

and data frame B:

var
A
C
library(tidyverse) 
A = tibble(var = c("A","B"))
B = tibble(var = c("A","C"))

I want to find all the possible combinations of the elements in the var column in a new data frame that must look like this :

varA varB
A A
A B
A C
B C

Now from this resulted data frame I want to find this combinations in another data frame no matter the position (first or second,varA - varB) of the new data frame and give me the column y with the corresponded values.

varA varB y
B A 1.12
A A 1.14
C A 1.15
B C 1.16
D F 1.23
H G 1.1
D = tibble(varA = c("B","A","C","B","D","H"),
           varB = c("A","A","A","C","F","G"),
           y = c(1.12,1.14,1.15,1.16,1.23,1.10));D

ideally I want to look like this :

varA varB y
A A 1.14
A B 1.12
A C 1.15
B C 1.16

How can I do this in R using dplyr package or base functions ?

CodePudding user response:

Here is my attempt.

library(tidyverse) 
A = tibble(var = c("A","B"))
B = tibble(var = c("A","C"))

# create data frame of all combinations 
B <- expand.grid(A$var, B$var)

D <- tibble(varA = c("B","A","C","B","D","H"),
            varB = c("A","A","A","C","F","G"),
            y = c(1.12,1.14,1.15,1.16,1.23,1.10));D

# create new column and sort 

strSort <- function(x)
  sapply(lapply(strsplit(x, NULL), sort), paste, collapse = "")

D <- D %>% mutate(var_new = strSort(paste(varA, varB)))
B <- B %>% mutate(var_new = strSort(paste(Var1, Var2)))


left_join(B, D) %>% select(Var1, Var2, y)

I use expand.grid to create all possible combination, and then I create a new column for both data frames with the sorted string (sorting function), to make sure the order does not matter, then it is a simple left join and cleaning the resulting data frame.

  • Related