Count in how many different groups I can find an element in R dplyr

Time:02-22

My data looks like this:

library(tidyverse)

df3 <- tibble(fruits=c("apple","banana","ananas","apple","ananas","apple","ananas"),
              position=c("135","135","135","136","137","138","138"), 
              counts = c(100,200,100,30,40,50,100))

df3
#> # A tibble: 7 × 3
#>   fruits position counts
#>   <chr>  <chr>     <dbl>
#> 1 apple  135         100
#> 2 banana 135         200
#> 3 ananas 135         100
#> 4 apple  136          30
#> 5 ananas 137          40
#> 6 apple  138          50
#> 7 ananas 138         100

Created on 2022-02-21 by the reprex package (v2.0.1)

I want to group_by fruits and, for each fruit, list the positions it occurs in and count how many different positions that is. I want my result to look like this:

fruits  groups       n_groups  sum_count
apple   135,136,138  3         180
banana  135          1         200
ananas  135,137,138  3         240

The groups column could be a list of characters; I do not care much about its structure.

Thank you for your time. Any guidance is appreciated.

CodePudding user response:

I don't fully understand your description, but you can get the desired data frame by grouping by fruits:

df3 %>% 
  group_by(fruits) %>% 
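  # unique() and n_distinct() count each position once, even if a fruit repeats a position
  summarise(groups = list(unique(position)), n_groups = n_distinct(position), counts = sum(counts))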

  fruits groups    n_groups counts
  <chr>  <list>       <int>  <dbl>
1 ananas <chr [3]>        3    240
2 apple  <chr [3]>        3    180
3 banana <chr [1]>        1    200
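
If a plain comma-separated string is preferred over a list column, a minimal sketch along these lines may help. It assumes the same df3 as in the question; the name res is only illustrative. It uses unique() and n_distinct() so that a position repeated within one fruit would be listed and counted once.

```r
library(dplyr)
library(tibble)

# Same data as in the question
df3 <- tibble(
  fruits   = c("apple", "banana", "ananas", "apple", "ananas", "apple", "ananas"),
  position = c("135", "135", "135", "136", "137", "138", "138"),
  counts   = c(100, 200, 100, 30, 40, 50, 100)
)

res <- df3 %>%
  group_by(fruits) %>%
  summarise(
    groups    = paste(unique(position), collapse = ","),  # distinct positions as one string
    n_groups  = n_distinct(position),                      # number of distinct positions
    sum_count = sum(counts)                                # total counts per fruit
  )

res
```

With the question's data every fruit appears at most once per position, so n() and n_distinct(position) give the same numbers; they only differ when a fruit has several rows for the same position.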

CodePudding user response:

Here is another possibility, using data.table:

Reprex

  • Code
library(data.table)
library(tibble)


setDT(df3)[, .(groups = paste(position, collapse = ","), n_groups = .N, sum_count = sum(counts)) , by = .(fruits)]
  • Output
#>    fruits      groups n_groups sum_count
#> 1:  apple 135,136,138        3       180
#> 2: banana         135        1       200
#> 3: ananas 135,137,138        3       240

Created on 2022-02-21 by the reprex package (v2.0.1)
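
Note that .N counts rows per fruit, which equals the number of distinct positions only when no fruit repeats a position. A hedged variant that counts distinct positions explicitly could look like the sketch below; it rebuilds the question's data as a data.table, and the names dt and res_dt are only illustrative.

```r
library(data.table)

# Same data as in the question, built directly as a data.table
dt <- data.table(
  fruits   = c("apple", "banana", "ananas", "apple", "ananas", "apple", "ananas"),
  position = c("135", "135", "135", "136", "137", "138", "138"),
  counts   = c(100, 200, 100, 30, 40, 50, 100)
)

res_dt <- dt[, .(
  groups    = paste(unique(position), collapse = ","),  # distinct positions as one string
  n_groups  = uniqueN(position),                         # number of distinct positions
  sum_count = sum(counts)                                # total counts per fruit
), by = fruits]

res_dt
```

Grouping with by keeps the fruits in order of first appearance (apple, banana, ananas), which matches the layout requested in the question.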
