I need to concatenate strings by group in dplyr, but each value in the resulting column should include only the current and preceding rows of the group, not the following ones.
I want my data to look like this:
ID | message | messages_used |
---|---|---|
1 | 53 | 53 |
1 | 54 | 53,54 |
1 | 55 | 53,54,55 |
2 | 53 | 53 |
2 | 58 | 53,58 |
Is this achievable using only dplyr?
CodePudding user response:
You can use Reduce(..., accumulate = TRUE) from base R:
library(dplyr)
df %>%
group_by(ID) %>%
mutate(messages_used = Reduce(\(x, y) paste(x, y, sep = ", "), message, accumulate = TRUE)) %>%
ungroup()
# # A tibble: 5 x 3
# ID message messages_used
# <int> <int> <chr>
# 1 1 53 53
# 2 1 54 53, 54
# 3 1 55 53, 54, 55
# 4 2 53 53
# 5 2 58 53, 58
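The snippet above relies on a data frame named df that is not shown. Here is a minimal self-contained sketch, assuming df holds the sample data from the question (its construction here is an assumption) and using sep = "," to match the comma-only format the question asks for:

```r
library(dplyr)

# Assumed sample data, reconstructed from the table in the question
df <- data.frame(
  ID = c(1L, 1L, 1L, 2L, 2L),
  message = c(53L, 54L, 55L, 53L, 58L)
)

result <- df %>%
  group_by(ID) %>%
  mutate(
    # Reduce() with accumulate = TRUE keeps every intermediate result,
    # so row i holds the concatenation of rows 1..i within the group
    messages_used = Reduce(
      function(x, y) paste(x, y, sep = ","),
      as.character(message),
      accumulate = TRUE
    )
  ) %>%
  ungroup()

print(result)
```

Converting message with as.character() up front keeps every intermediate value the same type, and function(x, y) is used instead of \(x, y) so the sketch also runs on R versions older than 4.1.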
CodePudding user response:
We can use dplyr::group_by() and purrr::accumulate():
dat <- data.frame(ID = c(1,1,1,2,2), message = c(53,54,55,53,58))
library(dplyr)
library(purrr)
dat %>%
group_by(ID) %>%
mutate(message_used = accumulate(as.character(message), ~ paste(.x, .y, sep = ",")))
#> # A tibble: 5 x 3
#> # Groups: ID [2]
#> ID message message_used
#> <dbl> <dbl> <chr>
#> 1 1 53 53
#> 2 1 54 53,54
#> 3 1 55 53,54,55
#> 4 2 53 53
#> 5 2 58 53,58
Created on 2022-05-11 by the reprex package (v2.0.1)
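If purrr is not available, the same running concatenation can be sketched with base R's vapply() inside mutate(). This is only an illustration: cumulative_paste is a hypothetical helper written for this example, not a function from dplyr or purrr.

```r
library(dplyr)

dat <- data.frame(ID = c(1, 1, 1, 2, 2), message = c(53, 54, 55, 53, 58))

# Hypothetical helper: for each position i, paste elements 1..i together
cumulative_paste <- function(x, sep = ",") {
  vapply(
    seq_along(x),
    function(i) paste(x[seq_len(i)], collapse = sep),
    character(1)
  )
}

out <- dat %>%
  group_by(ID) %>%
  mutate(messages_used = cumulative_paste(message)) %>%
  ungroup()

print(out)
```

This re-pastes the prefix for every row, so the work grows quadratically with group size. That is fine for small groups, but the accumulate-based versions are likely the better choice for long ones.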