Summarise columns into a list and keep the rest of the columns in R dplyr

Time:10-28

I have data that looks like this:

library(tidyverse)

df = tibble(gene=c("geneA","geneB","geneC"),
            dat1=c(100,100,50),
            dat2=c(50,100,20),
            dat3=c(10,20,30))

df
#> # A tibble: 3 × 4
#>   gene   dat1  dat2  dat3
#>   <chr> <dbl> <dbl> <dbl>
#> 1 geneA   100    50    10
#> 2 geneB   100   100    20
#> 3 geneC    50    20    30

Created on 2022-10-27 with reprex v2.0.2

I want to collect the values of every column except the first into a list stored in a new column, while keeping the original columns. I want my data to look like this:

#> # A tibble: 3 × 5
#>   gene   dat1  dat2  dat3 data
#>   <chr> <dbl> <dbl> <dbl> <list>
#> 1 geneA   100    50    10 100,50,10
#> 2 geneB   100   100    20 100,100,20
#> 3 geneC    50    20    30 50,20,30


CodePudding user response:

With rowwise and c_across:

df %>% 
  rowwise() %>% 
  mutate(data = list(c_across(-gene))) %>% 
  ungroup()  # drop the rowwise grouping once the list column is built

   gene dat1 dat2 dat3         data
1 geneA  100   50   10  100, 50, 10
2 geneB  100  100   20 100, 100, 20
3 geneC   50   20   30   50, 20, 30
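
To make the role of `list()` concrete, here is a minimal sketch that rebuilds the question's `df` and inspects the new column. The name `res` is introduced only for this illustration. `c_across()` returns one vector per row, and wrapping it in `list()` lets each cell of `data` hold that whole vector.

```r
library(dplyr)

# Same data as in the question
df <- tibble(gene = c("geneA", "geneB", "geneC"),
             dat1 = c(100, 100, 50),
             dat2 = c(50, 100, 20),
             dat3 = c(10, 20, 30))

# `res` is a hypothetical name used only in this sketch
res <- df %>%
  rowwise() %>%
  mutate(data = list(c_across(-gene))) %>%
  ungroup()  # later verbs then operate on whole columns again

# Pull out the vector stored for the first gene
first_vec <- res$data[[1]]

# Every element of the list column holds one value per dat column
vec_lengths <- lengths(res$data)
```

Indexing with `[[` returns the stored vector itself, whereas `[` would return a list of length one.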

CodePudding user response:

Using pmap:

library(dplyr)
library(purrr)
df %>%
   mutate(data = pmap(across(starts_with('dat')), c))

Output:

# A tibble: 3 × 5
  gene   dat1  dat2  dat3 data     
  <chr> <dbl> <dbl> <dbl> <list>   
1 geneA   100    50    10 <dbl [3]>
2 geneB   100   100    20 <dbl [3]>
3 geneC    50    20    30 <dbl [3]>
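
One detail of the `pmap` approach that may matter downstream: `pmap()` passes each row's values to `c()` as named arguments, so the stored vectors carry the column names. The sketch below, assuming the question's `df`, shows this and one possible way to get unnamed vectors instead. The names `res_pmap` and `res_plain` are made up for this illustration.

```r
library(dplyr)
library(purrr)

# Same data as in the question
df <- tibble(gene = c("geneA", "geneB", "geneC"),
             dat1 = c(100, 100, 50),
             dat2 = c(50, 100, 20),
             dat3 = c(10, 20, 30))

# The answer's approach: each row is passed to c() as dat1 = , dat2 = , dat3 =
res_pmap <- df %>%
  mutate(data = pmap(across(starts_with("dat")), c))

# The resulting vectors are named after the source columns
vec_names <- names(res_pmap$data[[1]])

# Variant (an assumption about what you might want): strip the names
res_plain <- df %>%
  mutate(data = pmap(across(starts_with("dat")),
                     function(...) unname(c(...))))
```

Whether the names are useful depends on what you do with the list column afterwards; keeping them makes each element self-describing, while dropping them gives plain numeric vectors.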