Home > Net >  How to add a sum column considering a pattern in name column if a condition is TRUE and substract it
How to add a sum column considering a pattern in name column if a condition is TRUE and substract it

Time:11-30

I have this:

transaction_id <- c(17,95,27,16,8)
r1<- c(84,28,18,51,98)
r1_added<- c(66,15,56,22,74)
r2<-c(50,58,52,90,97)
r2_added <-c(31,95,28,80,9)
r3<-c(20,91,61,87,11)
r3_added <-c(68,38,99,43,49)
condition <- c(0,1,0,1,0)
m <- as.data.frame(cbind(idc,transaction_id,r1,r1_added,r2,r2_added,r3,r3_added,condition))
idc transaction_id r1 r1_added r2 r2_added r3 r3_added condition
1 17 84 66 50 31 20 68 0
2 95 28 15 58 95 91 38 1
3 27 18 56 52 28 61 99 0
4 16 51 22 90 80 87 43 1
5 8 98 74 97 9 11 49 0

I need the sum of all values of columns names that ends with "added" in a new column but substract or assign 0 to those columns value

I need this: rows with condition 1 has to sum in "sum_added" but 0 in the columns that sum

idc transaction_id r1 r1_added r2 r2_added r3 r3_added condition sum_added
1 17 84 66 50 31 20 68 0 0
2 95 28 0 58 0 91 0 1 148
3 27 18 56 52 28 61 99 0 0
4 16 51 0 90 0 87 0 1 145
5 8 98 74 97 9 11 49 0 0

I was trying with dyplyr:

m%>%rowwise()%>%mutate(sum_added=ifelse(condition==1,sum(c_across(contains("_added"))),0))

but i don't know how to assign 0 to columns that sum, considering i have a lot of columns "added" not only 3 of this example

I need clarify, i need for example r1_added =0 , r2_added=0,,, and so on,updating values.

CodePudding user response:

We can select the columns that ends_with "_added" with across (or pick), get the row wise sum (rowSums), and multiply with condition (values that are 0 in condition will return 0 and those will 1 return the sum value), then loop across the _added and multiply with the negated (!) condition to modify the values in _added to 0

library(dplyr)
df1 <- df1 %>%
    mutate(sum_added = rowSums(pick(ends_with('_added')), 
     na.rm = TRUE) * condition) %>%
   mutate(across(c(ends_with('_added'),-sum_added), ~ .x * !condition))

-output

df1
 idc transaction_id r1 r1_added r2 r2_added r3 r3_added condition sum_added
1   1             17 84       66 50       31 20       68         0         0
2   2             95 28        0 58        0 91        0         1       148
3   3             27 18       56 52       28 61       99         0         0
4   4             16 51        0 90        0 87        0         1       145
5   5              8 98       74 97        9 11       49         0         0

data

df1 <- structure(list(idc = 1:5, transaction_id = c(17L, 95L, 27L, 16L, 
8L), r1 = c(84L, 28L, 18L, 51L, 98L), r1_added = c(66L, 15L, 
56L, 22L, 74L), r2 = c(50L, 58L, 52L, 90L, 97L), r2_added = c(31L, 
95L, 28L, 80L, 9L), r3 = c(20L, 91L, 61L, 87L, 11L), r3_added = c(68L, 
38L, 99L, 43L, 49L), condition = c(0L, 1L, 0L, 1L, 0L)), 
class = "data.frame", row.names = c(NA, 
-5L))
  • Related