Let's say I have the following data frame:
library(tidyverse)
names <- tibble::tribble(
~id, ~number,
1, "232",
2, "321",
3, "473",
2, "367",
3, "643",
3, "565",
3, "214",
)
I am writing an R Markdown document with a corresponding function that loops over the data frame and creates a report for each unique id
in the data frame (in this case, 3 reports). In each report, I am looking to reference all instances in the number
column for each unique id
. I am thus looking for something to concatenate the strings in the number
column. My end-goal is the below data frame:
names_expected <- tibble::tribble(
~id, ~number, ~numbers,
1, "232", "232",
2, "321", "321 and 367",
3, "473", "473, 643, 565 and 214",
2, "367", "321 and 367",
3, "643", "473, 643, 565 and 214",
3, "565", "473, 643, 565 and 214",
3, "214", "473, 643, 565 and 214"
)
Problems:
- How do I mutate the
numbers
column such that thenumbers
column concatenates allnumber
s occuring for anid
, i.e. the mutated column for the first occurrence whereid == 2
should be"321 and 367"
and not"321 and 321"
. - The separators in the concatenated string will differ depending on the number of strings that are concatenated:
- No separator is needed when the number of occurrences of the
id
is 1. - When an
id
occurs twice the separator should be" and "
. - When an
id
occurs more than twice, the separator for the firstn-1
occurrences should be", "
and" and "
for the last occurrence.
Ideally looking for a {dplyr}
solution.
Thanks!
CodePudding user response:
Does this work:
library(dplyr)
library(stringr)
names %>% group_by(id) %>% mutate(numbers = toString(number), numbers = ifelse(n() == 1, numbers, str_replace(numbers, ',(\\s)(\\d $)', '\\1and \\2')))
# A tibble: 7 x 3
# Groups: id [3]
id number numbers
<dbl> <chr> <chr>
1 1 232 232
2 2 321 321 and 367
3 3 473 473, 643, 565 and 214
4 2 367 321 and 367
5 3 643 473, 643, 565 and 214
6 3 565 473, 643, 565 and 214
7 3 214 473, 643, 565 and 214