group by two sets of vars in a function

Time:12-01

I'm using the sample dataset below:

mytable <- read.table(text=
                        "group team num  ID
1   a   x    1    9
2   a   x    2    4
3   a   y    3    5
4   a   y    4    9
5   b   x    1    7
6   b   y    4    4
7   b   x    3    9
8   b   y    2    8",
                      header = TRUE, stringsAsFactors = FALSE)

I want to create a separate data frame for each set of grouping variables. Some sets are a single variable, but I also want to group by two variables at once, and I'm not sure how to do that. For example, how do I get a separate data frame that groups the data by both team and ID? My attempt so far:

library(dplyr)

lapply(c("group","team","ID",c("team","ID")), function(x){
  group_by(mytable,across(c(x,num)))%>%summarise(Count = n()) %>% mutate(new=x)%>% as.data.frame()
})

CodePudding user response:

See if this is what you want.

library(dplyr)

cols <- list("group","team","ID", c("team","ID"))

lapply(cols, function(x, dat = mytable){
  dat2 <- dat %>%
    group_by(across(all_of(x))) %>%  # x is a character vector of column names
    summarise(Count = n()) %>% 
    mutate(new = toString(x)) %>% 
    as.data.frame()
  return(dat2)
})

# `summarise()` has grouped output by 'team'. You can override using the `.groups` argument.
# [[1]]
#   group Count   new
# 1     a     4 group
# 2     b     4 group
# 
# [[2]]
#   team Count  new
# 1    x     4 team
# 2    y     4 team
# 
# [[3]]
#   ID Count new
# 1  4     2  ID
# 2  5     1  ID
# 3  7     1  ID
# 4  8     1  ID
# 5  9     3  ID
# 
# [[4]]
#   team ID Count      new
# 1    x  4     1 team, ID
# 2    x  7     1 team, ID
# 3    x  9     2 team, ID
# 4    y  4     1 team, ID
# 5    y  5     1 team, ID
# 6    y  8     1 team, ID
# 7    y  9     1 team, ID
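
A note on why the grouping sets are stored in a list here: `c()` flattens nested vectors, so a call like `c("group","team","ID",c("team","ID"))` yields five single column names and the team/ID pair is lost before `lapply()` ever sees it. The minimal base R sketch below only illustrates that difference; the names `flat` and `sets` are made up for the example.

```r
# c() flattens its arguments: the nested pair becomes two separate elements
flat <- c("group", "team", "ID", c("team", "ID"))
length(flat)    # 5

# list() keeps each grouping set as one element, so the pair survives
sets <- list("group", "team", "ID", c("team", "ID"))
length(sets)    # 4
lengths(sets)   # 1 1 1 2

# Optional: label each set so the result of lapply() is a named list
names(sets) <- vapply(sets, toString, character(1))
names(sets)     # "group" "team" "ID" "team, ID"
```

With a named list, `lapply(sets, ...)` returns results that can be picked out by name rather than by position.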

CodePudding user response:

Does this, based on tidyverse, give you what you want?

library(tidyverse)

mytable %>% 
  group_by(team, ID) %>% 
  group_split()
<list_of<
  tbl_df<
    group: character
    team : character
    num  : integer
    ID   : integer
  >
>[7]>
[[1]]
# A tibble: 1 × 4
  group team    num    ID
  <chr> <chr> <int> <int>
1 a     x         2     4

[[2]]
# A tibble: 1 × 4
  group team    num    ID
  <chr> <chr> <int> <int>
1 b     x         1     7

[[3]]
# A tibble: 2 × 4
  group team    num    ID
  <chr> <chr> <int> <int>
1 a     x         1     9
2 b     x         3     9

[[4]]
# A tibble: 1 × 4
  group team    num    ID
  <chr> <chr> <int> <int>
1 b     y         4     4

[[5]]
# A tibble: 1 × 4
  group team    num    ID
  <chr> <chr> <int> <int>
1 a     y         3     5

[[6]]
# A tibble: 1 × 4
  group team    num    ID
  <chr> <chr> <int> <int>
1 b     y         2     8

[[7]]
# A tibble: 1 × 4
  group team    num    ID
  <chr> <chr> <int> <int>
1 a     y         4     9
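
If it helps to have the pieces labelled by their team/ID combination, a base R alternative (not what the answer above uses) is `split()` with a list of grouping columns. This is only a sketch: it rebuilds the sample data with `data.frame()` so it runs on its own, and `pieces` is a made-up name.

```r
# Sample data from the question, rebuilt so the example is self-contained
mytable <- data.frame(
  group = c("a", "a", "a", "a", "b", "b", "b", "b"),
  team  = c("x", "x", "y", "y", "x", "y", "x", "y"),
  num   = c(1L, 2L, 3L, 4L, 1L, 4L, 3L, 2L),
  ID    = c(9L, 4L, 5L, 9L, 7L, 4L, 9L, 8L),
  stringsAsFactors = FALSE
)

# Split on every team/ID combination that occurs in the data;
# drop = TRUE discards combinations with no rows
pieces <- split(mytable, list(mytable$team, mytable$ID), drop = TRUE)

length(pieces)   # 7 data frames, one per observed team/ID pair
names(pieces)    # labels of the form "team.ID", e.g. "x.9"
pieces[["x.9"]]  # the two rows with team "x" and ID 9
```

The result is a plain named list of data frames, so a single group can be pulled out by its label.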
