I have a large dataset, which I have simplified to this example:
BEZ <- c(0.5, 1.5)
var <- c(0, 1.5 )
bar <- c(3, 1.5)
BEZ1 <- c(0, 0.5)
var1 <- c(4, 4)
bar1 <- c(4, 4.5)
dat <- data.frame(BEZ, var, bar, BEZ1, var1, bar1)
dat
What I would like to do is add two final columns:
- the row-wise sum of all columns whose names do not start with 'BEZ';
- the first result divided by the row-wise sum of the columns that have 'BEZ' in their names.
This is what I have tried so far:
library(dplyr)

scores <- dat %>%
  select(-starts_with('BEZ')) %>%
  # replace(is.na(.), 0) %>%
  mutate(score_1 = rowSums(.),
         score_2 = score_1 / rowSums(dat %>% select(starts_with('BEZ'))))
# columns 5 and 6 of `scores` are score_1 and score_2
new <- cbind(dat, scores[, 5:6])
new
But I am looking for a simpler way that does not require splitting the work into separate chunks of code. Could you suggest any alternatives?
Thanks
CodePudding user response:
You can use across() inside mutate() to select the columns you want to row-sum:
library(dplyr)
dat %>%
  mutate(score_1 = rowSums(across(!starts_with('BEZ'))),
         score_2 = score_1 / rowSums(across(starts_with('BEZ'))))
#   BEZ var bar BEZ1 var1 bar1 score_1 score_2
# 1 0.5 0.0 3.0  0.0    4  4.0    11.0   22.00
# 2 1.5 1.5 1.5  0.5    4  4.5    11.5    5.75
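If you would rather not depend on dplyr, the same two columns can be sketched in base R. This is only one possible variant, not the only way to do it: the helper names is_bez, other_sum and bez_sum are made up for illustration, and na.rm = TRUE is an assumption that missing values in the real data should be ignored in the sums (the commented-out replace(is.na(.), 0) line in the question suggests this, but it may not fit your case).

```r
# Rebuild the example data so the block runs on its own
dat <- data.frame(BEZ  = c(0.5, 1.5), var  = c(0, 1.5), bar  = c(3, 1.5),
                  BEZ1 = c(0, 0.5),   var1 = c(4, 4),   bar1 = c(4, 4.5))

# TRUE for columns whose names start with "BEZ" (hypothetical helper name)
is_bez <- startsWith(names(dat), "BEZ")

# Compute both row sums before adding any column, so the new
# score columns can never end up inside either sum.
# na.rm = TRUE is an assumption: NAs are skipped rather than propagated.
other_sum <- rowSums(dat[!is_bez], na.rm = TRUE)
bez_sum   <- rowSums(dat[is_bez],  na.rm = TRUE)

dat$score_1 <- other_sum
dat$score_2 <- other_sum / bez_sum
dat
```

On the example data this should give the same score_1 and score_2 values as the across() version. Note that a row whose 'BEZ' columns sum to zero will produce Inf or NaN in score_2 in either approach.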