Home > Enterprise >  Concatenating strings with different separators depending on the number of occurrences
Concatenating strings with different separators depending on the number of occurrences

Time:11-03

Let's say I have the following data frame:

library(tidyverse)

names <- tibble::tribble(
  ~id, ~number,
  1, "232",
  2, "321",
  3, "473", 
  2, "367",
  3, "643",
  3, "565",
  3, "214",
)

I am writing an R Markdown document with a corresponding function that loops over the data frame and creates a report for each unique id in the data frame (in this case, 3 reports). In each report, I am looking to reference all instances in the number column for each unique id. I am thus looking for something to concatenate the strings in the number column. My end-goal is the below data frame:

names_expected <- tibble::tribble(
  ~id, ~number, ~numbers,
  1, "232", "232",
  2, "321", "321 and 367",
  3, "473", "473, 643, 565 and 214", 
  2, "367", "321 and 367",
  3, "643", "473, 643, 565 and 214",
  3, "565", "473, 643, 565 and 214",
  3, "214", "473, 643, 565 and 214"
)

Problems:

  • How do I mutate the numbers column such that the numbers column concatenates all numbers occuring for an id, i.e. the mutated column for the first occurrence where id == 2 should be "321 and 367" and not "321 and 321".
  • The separators in the concatenated string will differ depending on the number of strings that are concatenated:
  1. No separator is needed when the number of occurrences of the id is 1.
  2. When an id occurs twice the separator should be " and ".
  3. When an id occurs more than twice, the separator for the first n-1 occurrences should be ", " and " and " for the last occurrence.

Ideally looking for a {dplyr} solution.

Thanks!

CodePudding user response:

Does this work:

library(dplyr)
library(stringr)

names %>% group_by(id) %>% mutate(numbers = toString(number), numbers = ifelse(n() == 1, numbers, str_replace(numbers, ',(\\s)(\\d $)', '\\1and \\2')))
# A tibble: 7 x 3
# Groups:   id [3]
     id number numbers              
  <dbl> <chr>  <chr>                
1     1 232    232                  
2     2 321    321 and 367          
3     3 473    473, 643, 565 and 214
4     2 367    321 and 367          
5     3 643    473, 643, 565 and 214
6     3 565    473, 643, 565 and 214
7     3 214    473, 643, 565 and 214
  • Related