Home > Net >  Paste columns after summarise in R dplyr
Paste columns after summarise in R dplyr

Time:09-29

I am trying after summarise a data.frame based on multiple columns to collapse the rest of the columns and paste them into the output. In the example, I am trying to paste back the price column but I fail

library(tidyverse)

df <- tibble(type=c("Jeep","Ferrari", "Fiat","Lancia", "Bentley"), 
             category=c("expensive", "expensive", "cheap","cheap","unapproachable"),
              price=c(100,100,20,20,1000))

df
#> # A tibble: 5 × 3
#>   type    category       price
#>   <chr>   <chr>          <dbl>
#> 1 Jeep    expensive        100
#> 2 Ferrari expensive        100
#> 3 Fiat    cheap             20
#> 4 Lancia  cheap             20
#> 5 Bentley unapproachable  1000

df  |> 
  group_by(category) |> 
  summarise(cars=toString(sort(unique(type)), .groups='drop'))
#> # A tibble: 3 × 2
#>   category       cars         
#>   <chr>          <chr>        
#> 1 cheap          Fiat, Lancia 
#> 2 expensive      Ferrari, Jeep
#> 3 unapproachable Bentley

Created on 2022-09-28 with reprex v2.0.2

I want my data to look like this

    category            cars         price
#> 1 cheap          Fiat, Lancia     20
#> 2 expensive      Ferrari, Jeep    100
#> 3 unapproachable Bentley          1000

CodePudding user response:

This should help. You can add columns to bring along in the summarize()

df  |> 
    group_by(category) |> 
    summarise(cars=toString(sort(unique(type)), .groups='drop'),
          price = first(price))

CodePudding user response:

> df  |> 
    group_by(category) |> 
    summarise(cars=toString(sort(unique(type)), .groups='drop')) |> 
    left_join(unique(df[,-1])  , by = "category")
# A tibble: 3 × 3
  category       cars          price
  <chr>          <chr>         <dbl>
1 cheap          Fiat, Lancia     20
2 expensive      Ferrari, Jeep   100
3 unapproachable Bentley        1000

CodePudding user response:

You could group by both category and price:

library(tidyverse)

df  |>
  group_by(category, price) |> 
  summarise(cars=toString(sort(unique(type))), .groups='drop')
#> # A tibble: 3 × 3
#> # Groups:   category [3]
#>   category       price cars         
#>   <chr>          <dbl> <chr>        
#> 1 cheap             20 Fiat, Lancia 
#> 2 expensive        100 Ferrari, Jeep
#> 3 unapproachable  1000 Bentley
  • Related