Maintain original column titles with dplyr function in loop R

Time:10-02

Let's say I have a dataset like mtcars and I would like to loop over several columns to calculate counts (among other summaries).

library(dplyr)

df <- mtcars
groups <- c('cyl', 'hp')

for(g in groups) {
  group_counts <- df %>%
    group_by(get(g)) %>%
    count()

  print(group_counts)
}

Which gives me the following for the cyl column and something similar for the hp column:

     `get(g)`  n
        4    11
        6     7
        8    14

How do I get the first column to keep the original column name instead of 'get(g)'? Like this:

    cyl     n
     4    11
     6     7
     8    14

CodePudding user response:

You can group with across(all_of(g)), which selects the column named by the string in g and keeps its name:

for(g in groups) {
  group_counts <- df %>% 
    group_by(across(all_of(g))) %>% 
    count()
  
  print(group_counts)
}

Output:

# A tibble: 3 × 2
# Groups:   cyl [3]
    cyl     n
  <dbl> <int>
1     4    11
2     6     7
3     8    14
# A tibble: 22 × 2
# Groups:   hp [22]
      hp     n
   <dbl> <int>
 1    52     1
 2    62     1
 3    65     1
 4    66     2
 5    91     1
 6    93     1
 7    95     1
 8    97     1
 9   105     1
10   109     1
# … with 12 more rows
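
Since the question mentions doing more than printing, here is a minimal sketch of the same across(all_of()) grouping that stores each result instead of printing it. The name count_tables and the use of lapply() are illustrative assumptions, not part of the original answer:

```r
library(dplyr)

df <- mtcars
groups <- c("cyl", "hp")

# Build one count table per grouping column (count_tables is a hypothetical name)
count_tables <- lapply(groups, function(g) {
  df %>%
    group_by(across(all_of(g))) %>%  # group by the column whose name is stored in g
    count() %>%
    ungroup()                        # drop the grouping so later steps see a plain tibble
})

# Name each list element after its grouping column
names(count_tables) <- groups

print(count_tables$cyl)
```

Each table can then be picked out by column name, for example count_tables$hp, and its first column carries the original name rather than `get(g)`.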

CodePudding user response:

groups is a character vector. To refer to the columns with those names, we can convert each string to a symbol with sym() and unquote it with !!:

for(g in groups) {
    group_counts <- df %>% 
        group_by(!!sym(g)) %>% 
        count()
    
    print(group_counts)
}

# A tibble: 3 × 2
# Groups:   cyl [3]
    cyl     n
  <dbl> <int>
1     4    11
2     6     7
3     8    14
# A tibble: 22 × 2
# Groups:   hp [22]
      hp     n
   <dbl> <int>
 1    52     1
 2    62     1
 3    65     1
 4    66     2
 5    91     1
 6    93     1
 7    95     1
 8    97     1
 9   105     1
10   109     1
# … with 12 more rows
# ℹ Use `print(n = ...)` to see more rows
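
To see why this keeps the column name, it may help to inspect what !! injects. The sketch below is only an illustration: it calls sym() and expr() through the rlang namespace (assuming rlang is installed, which it is wherever dplyr is), and the names g_sym and injected are made up for the example:

```r
library(dplyr)

g <- "cyl"

# Turn the string "cyl" into the symbol cyl
g_sym <- rlang::sym(g)

# expr() builds the call without running it, so the effect of !! is visible
injected <- rlang::expr(group_by(mtcars, !!g_sym))
print(injected)

# Running the equivalent pipeline gives a first column named after the symbol
result <- mtcars %>%
  group_by(!!g_sym) %>%
  count()
print(names(result))
```

The captured call contains the bare column name, so dplyr labels the grouping column the same way it would if cyl had been typed directly.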

We can also convert groups into a list of symbols with syms() outside the loop, and unquote each one with !! inside the loop:

my_function <- function(df, groups) {
    groups <- syms(groups)
    for(g in groups) {
        group_counts <- df %>%
            group_by(!!g) %>%
            count()
        print(group_counts)
    }
}

my_function(df, groups)

# A tibble: 3 × 2
# Groups:   cyl [3]
    cyl     n
  <dbl> <int>
1     4    11
2     6     7
3     8    14
# A tibble: 22 × 2
# Groups:   hp [22]
      hp     n
   <dbl> <int>
 1    52     1
 2    62     1
 3    65     1
 4    66     2
 5    91     1
 6    93     1
 7    95     1
 8    97     1
 9   105     1
10   109     1
# … with 12 more rows
# ℹ Use `print(n = ...)` to see more rows