Define groups of columns and sum the i-th columns of each group with dplyr

I have two groups of columns, each with 36 columns, and I want to add the i-th column of group 1 to the i-th column of group 2, giving 36 columns. The number of columns per group is not fixed in my code, although both groups always have the same number of columns.

Example. What I have:

teste <- tibble(a1=c(1,2,3),a2=c(7,8,9),b1=c(4,5,6),b2=c(10,20,30))
     a1    a2    b1    b2
  <dbl> <dbl> <dbl> <dbl>
1     1     7     4    10
2     2     8     5    20
3     3     9     6    30

What I want:

resultado <- teste %>%
  summarise(
    a_b1 = a1 + b1,
    a_b2 = a2 + b2
  )
   a_b1  a_b2
  <dbl> <dbl>
1     5    17
2     7    28
3     9    39

It would be nice to perform this operation with dplyr.

I would appreciate any help.

CodePudding user response：

 teste %>%
   summarise(across(starts_with("a")) + across(starts_with("b")))

# A tibble: 3 x 2
     a1    a2
  <dbl> <dbl>
1     5    17
2     7    28
3     9    39

CodePudding user response：

You will struggle to find a dplyr solution as simple and elegant as the base R one:

teste[1:2] + teste[3:4]
#>   a1 a2
#> 1  5 17
#> 2  7 28
#> 3  9 39

Though I guess in dplyr you get the same result with:

teste %>% select(starts_with("a")) + teste %>% select(starts_with("b"))
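
Adding the two selections keeps the names of the first group (a1, a2), while the question asks for a_b1, a_b2. A minimal sketch of one way to rename the result, assuming both groups list their columns in the same order and that the first group's prefix is a single leading "a" (the name soma is only illustrative):

```r
library(dplyr)

teste <- tibble(a1 = c(1, 2, 3), a2 = c(7, 8, 9),
                b1 = c(4, 5, 6), b2 = c(10, 20, 30))

# Element-wise sum of the two column groups; columns are matched by
# position, so the i-th "a" column is added to the i-th "b" column.
soma <- (teste %>% select(starts_with("a"))) +
  (teste %>% select(starts_with("b")))

# Convert back to a tibble and turn the "a" prefix into "a_b".
resultado <- soma %>%
  as_tibble() %>%
  rename_with(~ sub("^a", "a_b", .x))

resultado
```

Because the match is positional, this works for any number of columns per group without listing them, but it silently gives wrong sums if the two groups are ordered differently.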

CodePudding user response：

This might also help in base R:

as.data.frame(do.call(cbind, lapply(split.default(teste, sub("\\D(\\d+)", "\\1", names(teste))), rowSums, na.rm = TRUE)))

  1  2
1 5 17
2 7 28
3 9 39

CodePudding user response：

Another dplyr solution. We can use rowwise and c_across together to sum the values per row. Notice that we can add na.rm = TRUE to the sum function in this case.

library(dplyr)

teste2 <- teste %>%
  rowwise() %>%
  transmute(a_b1 = sum(c_across(ends_with("1")), na.rm = TRUE),
            a_b2 = sum(c_across(ends_with("2")), na.rm = TRUE)) %>%
  ungroup()
  
teste2
# # A tibble: 3 x 2
#    a_b1  a_b2
#   <dbl> <dbl>
# 1     5    17
# 2     7    28
# 3     9    39
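
The rowwise version names each suffix by hand, and with 36 columns ends_with("1") would also match a11, a21 and a31. A rough sketch of a variant that groups columns by their full numeric suffix instead, assuming every column name is a non-digit prefix followed by a number (sufixo and grupos are illustrative names, not from the answers above):

```r
library(dplyr)

teste <- tibble(a1 = c(1, 2, 3), a2 = c(7, 8, 9),
                b1 = c(4, 5, 6), b2 = c(10, 20, 30))

# Numeric suffix of each column name, e.g. "1" "2" "1" "2".
sufixo <- sub("^\\D+", "", names(teste))

# Split the columns by suffix; fixing the factor levels keeps the
# original order (1, 2, ..., 10, 11) instead of alphabetical order.
grupos <- split.default(teste, factor(sufixo, levels = unique(sufixo)))

# Row-wise sum within each suffix group, then name the columns a_b<suffix>.
resultado <- grupos %>%
  lapply(function(d) unname(rowSums(d, na.rm = TRUE))) %>%
  as_tibble() %>%
  rename_with(~ paste0("a_b", .x))

resultado
```

This does not depend on the number of columns per group or on their order in the data, only on the naming pattern.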