How to sort a concatenated string in a column in R?

Time:07-09

I have a data frame with a column of concatenated strings. Within each string, I want to order the "-"-separated elements numerically by the number after the underscore:

df <- data.frame(Order = c("A23_2-A27_3-A40_4-A10_1", "A25_2-A21_3-A11_1", "A9_1", "A33_2-A8_1"))

I want a result like this, with the elements reordered and the numeric suffixes dropped:

df <- data.frame(Order = c("A10A23A27A40", "A11A25A21", "A9", "A8A33"))

I tried a couple of things with tidyverse but couldn't get a clean result.

CodePudding user response:

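library(tidyverse)

df %>%
  rowid_to_column() %>%                                  # keep track of the original row
  separate_rows(Order, sep = '-') %>%                    # one element per row
  separate(Order, c('Order', 'v'), convert = TRUE) %>%   # "A23_2" -> "A23" and integer 2
  arrange(v) %>%                                         # sort by the numeric suffix
  group_by(rowid) %>%
  summarise(Order = str_c(Order, collapse = ''))         # paste each row back together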
  
# A tibble: 4 x 2
  rowid Order       
  <int> <chr>       
1     1 A10A23A27A40
2     2 A11A25A21   
3     3 A9          
4     4 A8A33  

CodePudding user response:

Here is a base R option:

# Split on "-", order by the numeric suffix, then strip the suffix and paste
df$Order <-
  sapply(strsplit(df$Order, "-"), function(x)
    paste0(sub("_.*", "", x[order(as.numeric(sub("^[^_]*_", "", x)))]), collapse = ""))

Output

         Order
1 A10A23A27A40
2    A11A25A21
3           A9
4        A8A33
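
One detail worth checking in approaches that extract the suffix with sub(): the result is a character vector, and order() compares character data as text unless it is converted with as.numeric(). With single-digit suffixes, as in the question, both give the same order, but they diverge once a suffix reaches 10. A minimal sketch, using made-up tokens that are not from the question:

```r
# Hypothetical tokens whose suffixes go past 9 (not from the question's data)
x <- c("A5_10", "A7_2", "A3_1")

# Everything after the underscore, still as character: "10" "2" "1"
key <- sub("^[^_]*_", "", x)

# Text ordering puts "10" before "2"
text_sorted <- x[order(key)]

# Numeric ordering puts 2 before 10
num_sorted <- x[order(as.numeric(key))]

print(text_sorted)
print(num_sorted)
```

If the suffixes can ever exceed 9, the numeric version is the one that matches "order numerically".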

Or a tidyverse option:

library(tidyverse)

df %>%
  mutate(Order = map_chr(str_split(Order, "-"), ~
                       str_c(
                         str_replace_all(.x[order(as.numeric(str_replace_all(.x, "^[^_]*_", "")))], "_.*", ""), collapse = ""
                       )))
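
If this needs to be applied to more than one column, the same logic can be wrapped in a small function. The sketch below is one possible way to do it in base R; the name sort_by_suffix and its sep argument are hypothetical, not taken from either answer. It assumes every element has the form label_number, and it checks the result against the expected output given in the question.

```r
# sort_by_suffix() is a hypothetical helper name used for illustration only.
# Assumes each element looks like "<label>_<number>".
sort_by_suffix <- function(s, sep = "-") {
  vapply(strsplit(s, sep, fixed = TRUE), function(x) {
    key <- as.numeric(sub("^[^_]*_", "", x))  # number after the underscore
    lab <- sub("_.*$", "", x)                 # label before the underscore
    paste0(lab[order(key)], collapse = "")    # reorder labels and glue together
  }, character(1))
}

# Data and expected result copied from the question
df <- data.frame(
  Order = c("A23_2-A27_3-A40_4-A10_1", "A25_2-A21_3-A11_1", "A9_1", "A33_2-A8_1"),
  stringsAsFactors = FALSE  # keeps Order as character on R < 4.0
)
expected <- c("A10A23A27A40", "A11A25A21", "A9", "A8A33")

df$Order <- sort_by_suffix(df$Order)
stopifnot(identical(df$Order, expected))
```

vapply() with character(1) is used instead of sapply() so that the result is guaranteed to be a character vector with one string per row.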