Home > Software engineering >  r rowSums in case_when
r rowSums in case_when

Time:09-19

Considering this is my dataset

df <- 
structure(list(id = 1:6, w = c(NA, 11L, 12L, 13L, 14L, 15L), 
    x = c(20L, 21L, NA, 23L, 24L, 25L), y = c(30L, 31L, 32L, 
    NA, 34L, 35L), z = c(40L, NA, 42L, 43L, 44L, NA), Group = c("Group1", 
    "Group2", "Group2", "Group2", "Group3", "Group3")), row.names = c(NA, 
-6L), class = c("tbl_df", "tbl", "data.frame"))


    id     w     x     y     z Group   
     1    NA    20    30    40 Group1
     2    11    21    31    NA Group2
     3    12    NA    32    42 Group2
     4    13    23    NA    43 Group2
     5    14    24    34    44 Group3
     6    15    25    35    NA Group3

I am trying to sum observations from columns w, x, y,z row wise by Groups. I am using case_when statement like this.

df <- df %>%
  mutate(
    
    group1_total <- case_when( Group = "Group1" ~ rowSums(w,x, na.rm = TRUE)),
    group2_total <- case_when( Group = "Group2" ~ rowSums(w,x,z, na.rm = TRUE)),
    group3_total <- case_when( Group = "Group3" ~ rowSums(w,x,y,z, na.rm = TRUE))   
    
  )

I am getting an error saying, error in rowSums(). Not sure what I am doing wrong. Any suggestions or help is much appreciated, Thanks.

CodePudding user response:

I would create a named list where we specify the variables to be summed for each group.

cur_data() returns the data for the current group, group_cols[[unique(Group)]] selects the appropriate columns for each group.

library(dplyr)

group_cols <- list('Group1' = c('w', 'x'), 'Group2' = c('w', 'x', 'z'), 
                   'Group3' = c('w', 'x', 'y', 'z'))

df %>%
  group_by(Group) %>%
  mutate(total = rowSums(select(cur_data(), 
                         group_cols[[unique(Group)]]), na.rm = TRUE)) %>%
  ungroup

#     id     w     x     y     z Group  total
#  <int> <int> <int> <int> <int> <chr>  <dbl>
#1     1    NA    20    30    40 Group1    20
#2     2    11    21    31    NA Group2    32
#3     3    12    NA    32    42 Group2    54
#4     4    13    23    NA    43 Group2    79
#5     5    14    24    34    44 Group3   116
#6     6    15    25    35    NA Group3    75

The benefit of this is the calculation is done per group instead of per row.

CodePudding user response:

You could the combination of rowwise(), c_across() (c_across() is designed to work with rowwise() to make it easy to perform row-wise aggregations) and sum(), e.g.

library(dplyr)

df %>%
 rowwise() %>%   
  mutate(
    group1_total = case_when( Group == "Group1" ~ sum(c_across(w:x), na.rm = TRUE)),
    group2_total = case_when( Group == "Group2" ~ sum(c_across(c(w,x,z)), na.rm = TRUE)),
    group3_total = case_when( Group == "Group3" ~ sum(c_across(w:z), na.rm = TRUE))   
  )

output:

     id     w     x     y     z Group  group1_total group2_total group3_total
  <int> <int> <int> <int> <int> <chr>         <int>        <int>        <int>
1     1    NA    20    30    40 Group1           20           NA           NA
2     2    11    21    31    NA Group2           NA           32           NA
3     3    12    NA    32    42 Group2           NA           54           NA
4     4    13    23    NA    43 Group2           NA           79           NA
5     5    14    24    34    44 Group3           NA           NA          116
6     6    15    25    35    NA Group3           NA           NA           75

If you are interested in one total column you could then use coalesce(), eg.

df %>%
  rowwise() %>%   
  mutate(
    group1_total = case_when( Group == "Group1" ~ sum(c_across(w:x), na.rm = TRUE)),
    group2_total = case_when( Group == "Group2" ~ sum(c_across(c(w,x,z)), na.rm = TRUE)),
    group3_total = case_when( Group == "Group3" ~ sum(c_across(w:z), na.rm = TRUE))   
  ) %>% 
  mutate(total = coalesce(group1_total, group2_total,group3_total)) %>% 
  select(-contains("_total"))

Resulting in

# A tibble: 6 x 7
# Rowwise: 
     id     w     x     y     z Group  total
  <int> <int> <int> <int> <int> <chr>  <int>
1     1    NA    20    30    40 Group1    20
2     2    11    21    31    NA Group2    32
3     3    12    NA    32    42 Group2    54
4     4    13    23    NA    43 Group2    79
5     5    14    24    34    44 Group3   116
6     6    15    25    35    NA Group3    75
  • Related