I have this:
transaction_id <- c(17,95,27,16,8)
r1<- c(84,28,18,51,98)
r1_added<- c(66,15,56,22,74)
r2<-c(50,58,52,90,97)
r2_added <-c(31,95,28,80,9)
r3<-c(20,91,61,87,11)
r3_added <-c(68,38,99,43,49)
condition <- c(0,1,0,1,0)
m <- as.data.frame(cbind(idc,transaction_id,r1,r1_added,r2,r2_added,r3,r3_added,condition))
idc | transaction_id | r1 | r1_added | r2 | r2_added | r3 | r3_added | condition |
---|---|---|---|---|---|---|---|---|
1 | 17 | 84 | 66 | 50 | 31 | 20 | 68 | 0 |
2 | 95 | 28 | 15 | 58 | 95 | 91 | 38 | 1 |
3 | 27 | 18 | 56 | 52 | 28 | 61 | 99 | 0 |
4 | 16 | 51 | 22 | 90 | 80 | 87 | 43 | 1 |
5 | 8 | 98 | 74 | 97 | 9 | 11 | 49 | 0 |
I need the sum of all values of columns names that ends with "added" in a new column but substract or assign 0 to those columns value
I need this: rows with condition 1 has to sum in "sum_added" but 0 in the columns that sum
idc | transaction_id | r1 | r1_added | r2 | r2_added | r3 | r3_added | condition | sum_added |
---|---|---|---|---|---|---|---|---|---|
1 | 17 | 84 | 66 | 50 | 31 | 20 | 68 | 0 | 0 |
2 | 95 | 28 | 0 | 58 | 0 | 91 | 0 | 1 | 148 |
3 | 27 | 18 | 56 | 52 | 28 | 61 | 99 | 0 | 0 |
4 | 16 | 51 | 0 | 90 | 0 | 87 | 0 | 1 | 145 |
5 | 8 | 98 | 74 | 97 | 9 | 11 | 49 | 0 | 0 |
I was trying with dyplyr:
m%>%rowwise()%>%mutate(sum_added=ifelse(condition==1,sum(c_across(contains("_added"))),0))
but i don't know how to assign 0 to columns that sum, considering i have a lot of columns "added" not only 3 of this example
I need clarify, i need for example r1_added =0 , r2_added=0,,, and so on,updating values.
CodePudding user response:
We can select the columns that ends_with
"_added" with across
(or pick
), get the row wise sum (rowSums
), and multiply with condition
(values that are 0 in condition will return 0 and those will 1 return the sum value), then loop across
the _added
and multiply with the negated (!
) condition to modify the values in _added
to 0
library(dplyr)
df1 <- df1 %>%
mutate(sum_added = rowSums(pick(ends_with('_added')),
na.rm = TRUE) * condition) %>%
mutate(across(c(ends_with('_added'),-sum_added), ~ .x * !condition))
-output
df1
idc transaction_id r1 r1_added r2 r2_added r3 r3_added condition sum_added
1 1 17 84 66 50 31 20 68 0 0
2 2 95 28 0 58 0 91 0 1 148
3 3 27 18 56 52 28 61 99 0 0
4 4 16 51 0 90 0 87 0 1 145
5 5 8 98 74 97 9 11 49 0 0
data
df1 <- structure(list(idc = 1:5, transaction_id = c(17L, 95L, 27L, 16L,
8L), r1 = c(84L, 28L, 18L, 51L, 98L), r1_added = c(66L, 15L,
56L, 22L, 74L), r2 = c(50L, 58L, 52L, 90L, 97L), r2_added = c(31L,
95L, 28L, 80L, 9L), r3 = c(20L, 91L, 61L, 87L, 11L), r3_added = c(68L,
38L, 99L, 43L, 49L), condition = c(0L, 1L, 0L, 1L, 0L)),
class = "data.frame", row.names = c(NA,
-5L))