group tibble rows by unordered combination of columns

Time:04-08

Given the following tibble:

dat <- tibble(sample = 1:6,
              string = c("ABC","ABC","CBA","FED","DEF","DEF"),
              x = c("a","a","b","e","d","d"),
              y = c("b","b","a","d","e","e"))

# A tibble: 6 × 4
  sample string x     y    
   <int> <chr>  <chr> <chr>
1      1 ABC    a     b    
2      2 ABC    a     b    
3      3 CBA    b     a    
4      4 FED    e     d    
5      5 DEF    d     e    
6      6 DEF    d     e  

I would like to group rows by the unordered combination of columns x and y, so that a,b and b,a fall into the same group. Within each group, wherever x and y are inverted relative to the first row of the group, I want to swap x and y and reverse string. The desired output:

# A tibble: 6 × 5
  sample string x     y     group
   <int> <chr>  <chr> <chr> <dbl>
1      1 ABC    a     b         1
2      2 ABC    a     b         1
3      3 ABC    a     b         1
4      4 FED    e     d         2
5      5 FED    e     d         2
6      6 FED    e     d         2

Answer:

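library(dplyr)

# Sort the characters of each string so that "CBA" and "ABC" share a key
strSort <- function(x) sapply(lapply(strsplit(x, NULL), sort), paste, collapse="")

dat %>% 
  # rleid() numbers consecutive runs of equal keys; across() then copies the first row's values
  group_by(group = data.table::rleid(strSort(string))) %>% 
  mutate(across(string:y, first))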

# A tibble: 6 x 5
# Groups:   group [2]
  sample string x     y     group
   <int> <chr>  <chr> <chr> <int>
1      1 ABC    a     b         1
2      2 ABC    a     b         1
3      3 ABC    a     b         1
4      4 FED    e     d         2
5      5 FED    e     d         2
6      6 FED    e     d         2
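
The grouping above depends on two things: strSort turns each string into an order-independent key, and rleid numbers consecutive runs of that key. The following base R sketch shows both steps on a small hypothetical vector (not taken from the question), with the run-length ids written out by hand using cumsum instead of calling data.table. It also shows that run-based ids and first-appearance ids differ once equal keys are not adjacent, which may matter if the real data is not already ordered by group.

```r
# Hypothetical input: the fourth value repeats the first key after a gap
strings <- c("ABC", "CBA", "FED", "ABC")

# Same idea as strSort above: sort the characters within each string
strSort <- function(x) {
  vapply(strsplit(x, NULL),
         function(ch) paste(sort(ch), collapse = ""),
         character(1))
}
key <- strSort(strings)

# Run-length ids written out in base R: a new id starts whenever the key changes
run_id <- cumsum(c(TRUE, key[-1] != key[-length(key)]))

# Ids by first appearance: equal keys share an id even when they are not adjacent
seen_id <- match(key, unique(key))

print(key)      # "ABC" "ABC" "DEF" "ABC"
print(run_id)   # 1 1 2 3
print(seen_id)  # 1 1 2 1
```

For the six rows in the question both numberings give the same result, because rows of the same group are already next to each other.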

Previous answer

Here's a way that combines tidyverse and apply. First, sort x and y within each row, then group_by x and y, number the groups with cur_group_id, and apply stri_reverse to string wherever its first letter no longer matches x.

library(tidyverse)
library(stringi)

#Sort by row
dat[, c("x", "y")] <- t(apply(dat[, c("x", "y")], 1, sort))

dat %>% 
  group_by(x, y) %>% 
  mutate(group = cur_group_id(),
         string = ifelse(str_sub(string, 1, 1) == toupper(x), string, stri_reverse(string)))

# A tibble: 6 x 5
# Groups:   x, y [2]
  sample string x     y     group
   <int> <chr>  <chr> <chr> <int>
1      1 ABC    a     b         1
2      2 ABC    a     b         1
3      3 ABC    a     b         1
4      4 DEF    d     e         2
5      5 DEF    d     e         2
6      6 DEF    d     e         2
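
Because x and y are sorted alphabetically first, group 2 in the output above comes out as DEF/d/e rather than the FED/e/d shown in the desired output. A possible variant, sketched here under the assumption that the orientation of the first row in each group should win, builds the unordered key with pmin and pmax and only flips the rows whose x differs from the group's first x. The helper columns key, flipped and x_new are names introduced for this illustration.

```r
library(dplyr)
library(stringi)

dat <- tibble(sample = 1:6,
              string = c("ABC", "ABC", "CBA", "FED", "DEF", "DEF"),
              x = c("a", "a", "b", "e", "d", "d"),
              y = c("b", "b", "a", "d", "e", "e"))

res <- dat %>%
  # Order-independent key: smaller and larger of x and y, pasted together
  mutate(key = paste(pmin(x, y), pmax(x, y)),
         # Number the keys in order of first appearance
         group = match(key, unique(key))) %>%
  group_by(group) %>%
  # A row is flipped when its x differs from the x of the group's first row
  mutate(flipped = x != first(x),
         string = if_else(flipped, stri_reverse(string), string),
         x_new = if_else(flipped, y, x),   # uses the original x and y
         y = if_else(flipped, x, y),       # x is still the original here
         x = x_new) %>%
  ungroup() %>%
  select(sample, string, x, y, group)

print(res)
```

Unlike copying the first row's values across the group, this keeps each row's own string and only reverses it when the row is flipped, so groups containing different strings are not overwritten.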