Home > front end >  How to use dplyr's contains() within paste() in R?
How to use dplyr's contains() within paste() in R?

Time:03-29

I have a long list of columns, and values from the score columns into one column. However, rather than typing them all out, I'd like to just use contains("test") within the paste function--is this possible?

Here's my data and what I want it to look like:

library(dplyr)

#What I have :(
test <- tibble(id = c(1:2),
               test_score = c(4,5),
               test_building = c("Lupton", "Hearst"),
               initials = c("s", "j"))

#What I want ^_^
answer <- tibble(id = c(1:2),
                test_score = c(4,5),
                test_building = c("Lupton", "Hearst"),
               initials = c("s", "j"),
               test_combo = c("4, Lupton", "5, Hearst"))

And here's a graveyard of some failed attempts that I've tried:

test %>% 
  mutate(test_combo = paste(vars(contains("test"))))

test %>% 
  mutate(test_combo = paste(across(contains("test"))))

I'd like the order to be test_score and then test_building, but order really isn't that important, so I'd take a simple solution that pastes them correctly in the 'wrong' order versus a very complex path to put them in the 'right' order.

CodePudding user response:

We could use paste with across combined with .names argument and unite function:

library(dplyr)
library(tidyr)

test %>% 
  mutate(across(contains("test"), ~paste(.), .names ="new_{.col}")) %>% 
  unite(test_combo, starts_with('new'), na.rm = TRUE, sep = ', ')

     id test_score test_building initials test_combo
  <int>      <dbl> <chr>         <chr>    <chr>     
1     1          4 Lupton        s        4, Lupton 
2     2          5 Hearst        j        5, Hearst 

as an alternative

as @AdroMine and @Martin Gal (credits to both of them) pointed out: using unite:

library(tidyr)
test %>% unite(test_combo, contains("test"), remove = FALSE, sep = ", ")
     id test_combo test_score test_building initials
  <int> <chr>           <dbl> <chr>         <chr>   
1     1 4, Lupton           4 Lupton        s       
2     2 5, Hearst           5 Hearst        j  

CodePudding user response:

A complicated solution could be:

library(tidyr)
library(dplyr)

test %>% 
  mutate(across(starts_with("test_"), as.character)) %>% 
  pivot_longer(starts_with("test_")) %>% 
  group_by(id, initials) %>% 
  summarise(test_combo = paste(value, collapse = ", "), .groups = "drop") %>% 
  right_join(test, by = c("id", "initials"))

This returns

# A tibble: 2 x 5
     id initials test_combo test_score test_building
  <int> <chr>    <chr>           <dbl> <chr>        
1     1 s        4, Lupton           4 Lupton       
2     2 j        5, Hearst           5 Hearst   

A simple way could be

test %>% 
  group_by(id) %>% 
  mutate(test_combo = paste(across(contains("test")), collapse = ", ")) %>% 
  ungroup()

which returns

# A tibble: 2 x 5
     id test_score test_building initials test_combo
  <int>      <dbl> <chr>         <chr>    <chr>     
1     1          4 Lupton        s        4, Lupton 
2     2          5 Hearst        j        5, Hearst 
  • Related