I have a data frame that looks like the one below. Elements in col1 are connected, directly or indirectly, with elements in col2. For example, 1 is connected with 2 and 3, and 2 is connected with 3, so 1, 2 and 3 should all end up in the same group.
library(tidyverse)
df1 <- tibble(col1=c(1,1,2,5,5,6),
col2=c(2,3,3,6,7,7))
df1
#> # A tibble: 6 × 2
#>    col1  col2
#>   <dbl> <dbl>
#> 1     1     2
#> 2     1     3
#> 3     2     3
#> 4     5     6
#> 5     5     7
#> 6     6     7
Created on 2022-03-15 by the reprex package (v2.0.1)
I want my data to look like this, with a third column that labels each connected group:
#>    col1  col2 col3
#>   <dbl> <dbl> <chr>
#> 1     1     2 group1
#> 2     1     3 group1
#> 3     2     3 group1
#> 4     5     6 group2
#> 5     5     7 group2
#> 6     6     7 group2
I would appreciate any help solving this riddle. Thank you for your time.
CodePudding user response:
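We may use igraph: treat each row as an edge between col1 and col2, find the connected components of the resulting graph, and label each row with the component that its col1 value belongs to.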
library(igraph)
library(dplyr)
library(stringr)
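# Each row of df1 becomes an undirected edge col1 -- col2
g <- graph_from_data_frame(df1, directed = FALSE)
# membership is a vector named by vertex, so index it by col1 as character
df1 %>%
  mutate(col3 = str_c("group", components(g)$membership[as.character(col1)]))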
-output
# A tibble: 6 × 3
   col1  col2 col3
  <dbl> <dbl> <chr>
1     1     2 group1
2     1     3 group1
3     2     3 group1
4     5     6 group2
5     5     7 group2
6     6     7 group2
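If igraph is not available, the same grouping can be sketched in base R. This is only a rough illustration of the connected-components idea, not what igraph does internally: `label_components` is a hypothetical helper written for this example, and it assumes the edge list is small enough that repeated passes over the rows are acceptable.

```r
# Toy edge list from the question: each row links col1 to col2.
df1 <- data.frame(col1 = c(1, 1, 2, 5, 5, 6),
                  col2 = c(2, 3, 3, 6, 7, 7))

# Hypothetical helper (not an igraph function): label connected components
# by repeatedly pulling both ends of every edge down to the smaller label.
label_components <- function(from, to) {
  nodes <- unique(c(from, to))
  label <- seq_along(nodes)      # start with every node in its own group
  i <- match(from, nodes)        # position of each edge's first endpoint
  j <- match(to, nodes)          # position of each edge's second endpoint
  repeat {
    old <- label
    for (k in seq_along(i)) {
      m <- min(label[i[k]], label[j[k]])
      label[i[k]] <- m
      label[j[k]] <- m
    }
    if (identical(label, old)) break   # no label changed: components are stable
  }
  # Renumber the surviving labels as 1, 2, ... in order of first appearance
  label <- match(label, unique(label))
  names(label) <- nodes
  label
}

membership <- label_components(df1$col1, df1$col2)
# Look up each row's group through its col1 value, as in the igraph answer
df1$col3 <- paste0("group", membership[as.character(df1$col1)])
df1
```

The lookup by `as.character(col1)` works for the same reason as in the igraph version: the membership vector is named by node, so indexing by name returns one group number per row.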