Loop through specific columns of dataframe keeping some columns as fixed

Time:05-04

I have a large dataset whose first two columns serve as identifiers (one is an ID and the other is a year variable). I would like to compute a count by group, looping over every variable that is not an identifier. The code below shows what I want to achieve for one variable:

library(tidyverse)

df <- tibble(
  ID1 = c(rep("a", 10), rep("b", 10)),
  year = c(2001:2020),
  var1 = rnorm(20),
  var2 = rnorm(20))

df %>%
  select(ID1, year, var1) %>%
  filter(if_any(starts_with("var"), ~!is.na(.))) %>%
  group_by(year) %>%
  count() %>%
  print(n = Inf)

I cannot use a loop that starts with for(i in names(df)), since I want to keep the variables "ID1" and "year" fixed. How can I run this piece of code for all the columns that start with "var"? I tried using quosures, but it did not work: I receive the error select() doesn't handle lists. I also tried select(starts_with("var")), but with no success. Many thanks!

CodePudding user response:

Another possible solution:

library(tidyverse)

df %>% 
  group_by(ID1) %>% 
  summarise(across(starts_with("var"), ~ length(na.omit(.x))))

#> # A tibble: 2 × 3
#>   ID1    var1  var2
#>   <chr> <int> <int>
#> 1 a        10    10
#> 2 b        10    10
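The same across() pattern can also be grouped by year, which is the grouping used in the question. A minimal sketch, assuming the question's example data; the two NA values are injected here purely for illustration and are not part of the original data:

```r
library(tidyverse)

# Example data shaped like the question's, plus two NAs in var1
# (an assumption, added so the counts are not all identical).
df <- tibble(
  ID1 = c(rep("a", 10), rep("b", 10)),
  year = 2001:2020,
  var1 = rnorm(20),
  var2 = rnorm(20)
)
df$var1[c(2, 5)] <- NA

# One row per year, one column of non-missing counts per "var" column.
by_year <- df %>%
  group_by(year) %>%
  summarise(across(starts_with("var"), ~ sum(!is.na(.x))))

print(by_year, n = Inf)
```

Unlike the filter() plus count() version in the question, a year whose values are all missing stays in the result with a count of 0 instead of being dropped.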

CodePudding user response:

for(i in names(df)[grepl('var',names(df))])
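The line above only gives the loop header. A minimal sketch of what a full loop might look like, assuming the question's example data and pipeline; the names var_cols and counts are hypothetical, the NA values are injected for illustration, and the pattern is anchored as "^var" so that only names starting with "var" match:

```r
library(tidyverse)

# Example data shaped like the question's, plus two NAs in var1
# (an assumption, added so the counts differ between columns).
df <- tibble(
  ID1 = c(rep("a", 10), rep("b", 10)),
  year = 2001:2020,
  var1 = rnorm(20),
  var2 = rnorm(20)
)
df$var1[c(2, 5)] <- NA

# Columns to loop over: names beginning with "var" (hypothetical helper name).
var_cols <- grep("^var", names(df), value = TRUE)

# Collect one count table per variable in a named list.
counts <- list()
for (i in var_cols) {
  counts[[i]] <- df %>%
    select(ID1, year, all_of(i)) %>%   # keep the ID columns fixed
    filter(!is.na(.data[[i]])) %>%     # drop rows where this variable is NA
    count(year)                        # rows per year
}

print(counts$var1, n = Inf)
```

Here all_of(i) and .data[[i]] let select() and filter() take the column name as a string, which avoids quosures. With the injected NAs, the years 2002 and 2005 are absent from counts$var1.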