Sum Columns in a dataframe where the names match a vector list

Time:06-14

I have a dataframe made up largely of integers, with one column per community. I have grouped the community names into vectors by region, like so:

RegionA <- c(a,c,d)
RegionB <- c(b,e,f)
RegionC <- c(g,h,i)

    Year     a     b     c     d     e     f     g     h     i   `5`
   <dbl> <int> <int> <int> <int> <int> <int> <int> <int> <int> <dbl>
 1  2021    61    44     1    78    37    46    33    16    57     5
 2  2020    60    54    60     2    72    59    60    34    60     5
 3  2019    53    77    39    66    85    82    65    95    50     5
 4  2018    78    20    63    26    41    29    19    82    46     5
 5  2017    62    38    22    23     6    11    20    51    65     5
 6  2021    39    15    38    74    90    83    73    12    71     5
 7  2020    28    23    76    57   100    89    62    14    56     5
 8  2019    82    48    40    45    93    72    40    45    29     5
 9  2018    13    69   100    13     5    52    99    52    47     5
10  2017    92    13    13    96    98    17    46    49    74     5

I am trying to select the columns named in each Region vector and sum them row by row into a new column.

I have tried using

df <- df %>%
   mutate(Region_A = rowSums(select(., colnames %in% RegionA)))

and

df <- df %>%
   rowwise %>%
   mutate(Region_A = sum(c_across(where(colnames %in% RegionA))))

with no success; I get this error:

Caused by error in `match()`:
! 'match' requires vector arguments

What could be the proper solution?

CodePudding user response:

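A possible solution: colnames is a function, not a vector of names, so colnames %in% RegionA fails inside match(). Store the column names in each region vector as quoted strings and select them with all_of() inside c_across():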

library(dplyr)

RegionA <- c("a","c","d")
RegionB <- c("b","e","f")
RegionC <- c("g","h","i")

df %>% 
  rowwise() %>%   # evaluate the sums one row at a time
  mutate(RegionA = sum(c_across(all_of(RegionA))),
         RegionB = sum(c_across(all_of(RegionB))),
         RegionC = sum(c_across(all_of(RegionC)))) %>% 
  ungroup()       # drop the rowwise grouping afterwards

#> # A tibble: 10 × 13
#>     Year     a     b     c     d     e     f     g     h     i RegionA RegionB
#>    <int> <int> <int> <int> <int> <int> <int> <int> <int> <int>   <int>   <int>
#>  1  2021    61    44     1    78    37    46    33    16    57     140     127
#>  2  2020    60    54    60     2    72    59    60    34    60     122     185
#>  3  2019    53    77    39    66    85    82    65    95    50     158     244
#>  4  2018    78    20    63    26    41    29    19    82    46     167      90
#>  5  2017    62    38    22    23     6    11    20    51    65     107      55
#>  6  2021    39    15    38    74    90    83    73    12    71     151     188
#>  7  2020    28    23    76    57   100    89    62    14    56     161     212
#>  8  2019    82    48    40    45    93    72    40    45    29     167     213
#>  9  2018    13    69   100    13     5    52    99    52    47     126     126
#> 10  2017    92    13    13    96    98    17    46    49    74     201     128
#> # … with 1 more variable: RegionC <int>
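
If the number of regions grows, writing one mutate() line per region gets repetitive. As a rough sketch of an alternative (not part of the answer above), the region vectors could be kept in a named list and summed with base R's rowSums(), which is vectorized and does not need rowwise(). The list name regions and the object result are made up for this illustration, and the data is only the first three rows from the question. The same indexing should work on a tibble as well as a plain data.frame.

```r
# First three rows of the question's data (the `5` column is left out)
df <- data.frame(
  Year = c(2021L, 2020L, 2019L),
  a = c(61L, 60L, 53L), b = c(44L, 54L, 77L), c = c(1L, 60L, 39L),
  d = c(78L, 2L, 66L),  e = c(37L, 72L, 85L), f = c(46L, 59L, 82L),
  g = c(33L, 60L, 65L), h = c(16L, 34L, 95L), i = c(57L, 60L, 50L)
)

# Hypothetical named list: one character vector of column names per region
regions <- list(
  RegionA = c("a", "c", "d"),
  RegionB = c("b", "e", "f"),
  RegionC = c("g", "h", "i")
)

# Add one summed column per region; rowSums() returns doubles, not integers
result <- df
for (region in names(regions)) {
  result[[region]] <- unname(rowSums(df[regions[[region]]]))
}

print(result)
# RegionA: 140 122 158
# RegionB: 127 185 244
# RegionC: 106 154 210
```

The sums are always taken from the original df, so a region column added earlier in the loop cannot leak into a later sum.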