I have a data frame with a concatenated column. Within each row, I want to order the pieces numerically by the number after the underscore:
df <- data.frame(Order = c("A23_2-A27_3-A40_4-A10_1", "A25_2-A21_3-A11_1", "A9_1", "A33_2-A8_1"))
and I want a result like this:
df <- data.frame(Order = c("A10A23A27A40", "A11A25A21", "A9", "A8A33"))
I tried a couple of things with tidyverse but couldn't get a clean result.
CodePudding user response:
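library(tidyverse)

df %>%
  rowid_to_column() %>%                                  # remember the original row
  separate_rows(Order, sep = '-') %>%                    # one row per piece
  separate(Order, c('Order', 'v'), convert = TRUE) %>%   # split at "_"; v becomes an integer
  arrange(v) %>%                                         # sort numerically by the suffix
  group_by(rowid) %>%
  summarise(Order = str_c(Order, collapse = ''))         # paste the pieces back per row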
# A tibble: 4 x 2
  rowid Order
  <int> <chr>
1     1 A10A23A27A40
2     2 A11A25A21
3     3 A9
4     4 A8A33
CodePudding user response:
Here is a base R option:
df$Order <-
  sapply(strsplit(df$Order, "-"), function(x)
    # order by the numeric suffix, then drop "_<number>" from each piece
    paste0(sub("_.*", "", x[order(as.numeric(sub("^[^_]*_", "", x)))]), collapse = ""))
Output:
         Order
1 A10A23A27A40
2    A11A25A21
3           A9
4        A8A33
Or a tidyverse option:
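library(tidyverse)

df %>%
  # map_chr() returns a character column rather than a list column
  mutate(Order = map_chr(str_split(Order, "-"), ~
    str_c(
      # order by the numeric suffix, then drop "_<number>" from each piece
      str_replace_all(.x[order(as.numeric(str_replace_all(.x, "^[^_]*_", "")))], "\\_.*", ""),
      collapse = ""
    )))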
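One thing to watch with either option: the suffix pulled out with sub() is a character string, and order() on character strings sorts as text, so a suffix of 10 would land before 2. The sample data only goes up to 4, so it doesn't show up there. A minimal sketch of the difference, using a made-up vector `x` and a hypothetical helper `sort_order()` (neither is from the question):

```r
# Hypothetical input with a two-digit suffix (not from the question)
x <- c("A5_10", "A7_2", "A3_1")

# Part after the underscore; these are still character strings
suffix <- sub("^[^_]*_", "", x)

# Text ordering: "10" sorts before "2"
by_text <- x[order(suffix)]

# Numeric ordering: convert first, then order by value
by_number <- x[order(as.numeric(suffix))]

# Hypothetical helper that applies the numeric ordering to one
# "-"-separated string and drops the "_<number>" suffixes
sort_order <- function(s) {
  parts <- strsplit(s, "-", fixed = TRUE)[[1]]
  key <- as.numeric(sub("^[^_]*_", "", parts))
  paste0(sub("_.*", "", parts[order(key)]), collapse = "")
}

print(by_text)
print(by_number)
print(sort_order("A23_2-A27_3-A40_4-A10_1"))
```

Here `by_text` puts A5_10 in the middle while `by_number` puts it last, so wrapping the key in as.numeric() is the safer choice if suffixes can go past 9.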