I have a table which looks like:
Order | Col A | Col B |
---|---|---|
1 | a | 2,3,4,5 |
2 | a | 3,5,6,7,8 |
3 | a | 1,2,4,9 |
4 | a | 3,5,7,11,12 |
I want to aggregate this table by Col A. The output should look like the following:
Order | Col A | Col B | Col C |
---|---|---|---|
1 | a | 2,3,4,5 | 2,3,4,5 |
2 | a | 3,5,6,7,8 | 2,3,4,5,6,7,8 |
3 | a | 1,2,4,9 | 1,2,3,4,5,6,7,8,9 |
4 | a | 3,5,7,11,12 | 1,2,3,4,5,6,7,8,9,11,12 |
Please guide me on how I get the desirable output in R?
CodePudding user response:
This ought to do it:
library(dplyr)
df %>%
group_by(ColA) %>%
mutate(
result = strsplit(Colb, split = ","),
result = lapply(result, as.numeric),
result = Reduce(f = union, x = result, accumulate = TRUE),
result = lapply(result, sort),
result = sapply(result, paste, collapse = ",")
) %>%
ungroup()
# # A tibble: 4 × 4
# Order ColA Colb result
# <int> <chr> <chr> <chr>
# 1 1 a 2,3,4,5 2,3,4,5
# 2 2 a 3,5,6,7,8 2,3,4,5,6,7,8
# 3 3 a 1,2,4,9 1,2,3,4,5,6,7,8,9
# 4 4 a 3,5,7,11,12 1,2,3,4,5,6,7,8,9,11,12
Using this data:
df = read.table(text = "Order ColA Colb
1 a '2,3,4,5'
2 a '3,5,6,7,8'
3 a '1,2,4,9'
4 a '3,5,7,11,12' ", header = T)
CodePudding user response:
df %>%
group_by(ColA)%>%
mutate(ColC = map_chr(accumulate(strsplit(ColB,','),
~union(.x,.y)), str_c, collapse=','))
# A tibble: 4 × 4
# Groups: ColA [1]
Order ColA ColB ColC
<int> <chr> <chr> <chr>
1 1 a 2,3,4,5 2,3,4,5
2 2 a 3,5,6,7,8 2,3,4,5,6,7,8
3 3 a 1,2,4,9 2,3,4,5,6,7,8,1,9
4 4 a 3,5,7,11,12 2,3,4,5,6,7,8,1,9,11,12