How to get the sum of rows using a vector and then put the result in a column

Time:10-29

I have a data frame, and for every row I want to calculate the sum of the variables listed in a vector and store that sum in a new variable. I also want the name of the new variable to be built from the names of the variables in the vector.

for example

data

Name      A_12    B_12    C_12   D_12    E_12
r1        1         5      12      21     15
r2        2         4       7      10      9
r3        5        15      16       9      6
r4        7         8       0       7     18

let's say i have two vectors

vector_1 <- c("A_12","B_12","C_12")
vector_2 <- c("B_12","C_12","D_12","E_12")

The result i want is :

New_data >

 Name        A_12     B_12   C_12   ABC_12     D_12    E_12   BCDE_12
    r1        1         5     12      18         21     15      54
    r2        2         4      7      13         10      9      32
    r3        5        15     16      36          9      6      45
    r4        7         8      0      15          7     18      40

I created a for loop to get the sum of the rows for a vector, but I didn't get the correct result. Please tell me if you need any more information or clarification. Thank you.

CodePudding user response:

You can use rowSums and simple column-subsetting:

dat$ABC_12 <- rowSums(dat[,vector_1])
dat$BCDE_12 <- rowSums(dat[,vector_2])
dat
#   Name A_12 B_12 C_12 D_12 E_12 ABC_12 BCDE_12
# 1   r1    1    5   12   21   15     18      53
# 2   r2    2    4    7   10    9     13      30
# 3   r3    5   15   16    9    6     36      46
# 4   r4    7    8    0    7   18     15      33
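For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the example data from the question first. The name `dat` follows the code above; the numeric values are copied from the question's table, and the `drop = FALSE` argument is my own addition, not part of the original answer.

```r
# Rebuild the example data from the question.
dat <- data.frame(
  Name = c("r1", "r2", "r3", "r4"),
  A_12 = c(1, 2, 5, 7),
  B_12 = c(5, 4, 15, 8),
  C_12 = c(12, 7, 16, 0),
  D_12 = c(21, 10, 9, 7),
  E_12 = c(15, 9, 6, 18)
)
vector_1 <- c("A_12", "B_12", "C_12")
vector_2 <- c("B_12", "C_12", "D_12", "E_12")

# Subsetting by a character vector keeps only those columns;
# rowSums() then adds across them, one total per row.
# drop = FALSE keeps the subset a data frame even if a vector names one column.
dat$ABC_12  <- rowSums(dat[, vector_1, drop = FALSE])
dat$BCDE_12 <- rowSums(dat[, vector_2, drop = FALSE])

print(dat$ABC_12)   # 18 13 36 15
print(dat$BCDE_12)  # 53 30 46 33
```

Without `drop = FALSE`, a vector holding a single column name would make `dat[, v]` collapse to a plain vector, which `rowSums` rejects. If the real data contains missing values, `rowSums(..., na.rm = TRUE)` is probably what you want.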

Note that if your frames inherit from data.table, then you'll need to use either subset(dat, select=vector_1) or dat[,..vector_1] instead of simply dat[,vector_1]; if you aren't already using data.table, then you can safely ignore this paragraph.
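For the data.table case, a rough sketch could look like the following. It assumes the data.table package is installed; the object name `dt` is just for illustration, and the `.SDcols` form is an alternative I'm suggesting alongside the `..vector_1` form mentioned above.

```r
library(data.table)

# Same example data as in the question, built as a data.table.
dt <- data.table(
  Name = c("r1", "r2", "r3", "r4"),
  A_12 = c(1, 2, 5, 7),
  B_12 = c(5, 4, 15, 8),
  C_12 = c(12, 7, 16, 0),
  D_12 = c(21, 10, 9, 7),
  E_12 = c(15, 9, 6, 18)
)
vector_1 <- c("A_12", "B_12", "C_12")
vector_2 <- c("B_12", "C_12", "D_12", "E_12")

# .SDcols restricts .SD to the named columns; := adds the column by reference.
dt[, ABC_12 := rowSums(.SD), .SDcols = vector_1]
dt[, BCDE_12 := rowSums(.SD), .SDcols = vector_2]

# The ..vector_1 form from the note above gives the same totals:
check <- rowSums(dt[, ..vector_1])

print(dt)
```

The `:=` form modifies `dt` in place, so there is no need to reassign the result.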

CodePudding user response:

Like this (using dplyr/tidyverse)

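library(dplyr)

df %>% 
  rowwise() %>%
  mutate(
    # all_of() makes it explicit that vector_1 / vector_2 are external character vectors
    ABC_12 = sum(c_across(all_of(vector_1))),
    BCDE_12 = sum(c_across(all_of(vector_2)))
  )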

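Though I'm not sure the sums in your example are correct: the BCDE_12 column doesn't match B_12 + C_12 + D_12 + E_12 (for r1 that is 5 + 12 + 21 + 15 = 53, not 54).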
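As a side note, rowwise() can get slow on large frames. A possible alternative that stays in dplyr but lets rowSums do the work is sketched below. It assumes dplyr 1.1.0 or later for `pick()`, and the name `out` is only for illustration.

```r
library(dplyr)

# Example data from the question.
df <- data.frame(
  Name = c("r1", "r2", "r3", "r4"),
  A_12 = c(1, 2, 5, 7),
  B_12 = c(5, 4, 15, 8),
  C_12 = c(12, 7, 16, 0),
  D_12 = c(21, 10, 9, 7),
  E_12 = c(15, 9, 6, 18)
)
vector_1 <- c("A_12", "B_12", "C_12")
vector_2 <- c("B_12", "C_12", "D_12", "E_12")

# pick() returns the selected columns as a data frame,
# so rowSums() can total them in one vectorised call (no rowwise() needed).
out <- df %>%
  mutate(
    ABC_12  = rowSums(pick(all_of(vector_1))),
    BCDE_12 = rowSums(pick(all_of(vector_2)))
  )

print(out)
```

This should give the same totals as the rowwise() version, just without grouping the frame row by row.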

-=-=-=EDIT-=-=-=- Here's a function to help with the naming.

ex_fun <- function(vec, n_len){
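  # Glue the first n_len characters of every name together, then append
  # whatever follows those characters in the first name (e.g. "_12").
  paste0(paste(substr(vec,1,n_len), collapse = ""), substr(vec[1],n_len + 1,nchar(vec[1])))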
}
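To see what names the helper produces, here is a small check. It assumes the intended expression is `n_len + 1`, and it splits the one-liner into `prefix` and `suffix` (my own variable names) purely for readability.

```r
# Same logic as ex_fun above, split into two steps.
ex_fun <- function(vec, n_len) {
  # First n_len characters of every name, glued together: "A","B","C" -> "ABC"
  prefix <- paste(substr(vec, 1, n_len), collapse = "")
  # Everything after the first n_len characters of the FIRST name: "_12"
  suffix <- substr(vec[1], n_len + 1, nchar(vec[1]))
  paste0(prefix, suffix)
}

vector_1 <- c("A_12", "B_12", "C_12")
vector_2 <- c("B_12", "C_12", "D_12", "E_12")

name_1 <- ex_fun(vector_1, 1)
name_2 <- ex_fun(vector_2, 1)
print(name_1)  # "ABC_12"
print(name_2)  # "BCDE_12"
```

Note that the suffix is taken from the first element only, so this relies on every name in the vector sharing the same suffix and the same prefix length.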

Which can then be implemented like so.

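df %>% 
  rowwise() %>%
  mutate(
    # !! evaluates ex_fun() to get the column name; := allows a computed name on the left
    !!ex_fun(vector_1, 1) := sum(c_across(all_of(vector_1))),
    !!ex_fun(vector_2, 1) := sum(c_across(all_of(vector_2)))
  )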

-=-= Extra note -=--=

If you put your vectors in a list, you could combine this with r2evans's rowSums answer and run it in a loop if you prefer.

vectors = list(vector_1, vector_2)

for (v in vectors){
  df[ex_fun(v, 1)] <- rowSums(df[,v])
}
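Put together on the question's data, the loop might look like the sketch below. The final reordering step is my own addition to match the column layout shown in the question's desired output, with the column order written out by hand rather than computed.

```r
# Example data from the question.
df <- data.frame(
  Name = c("r1", "r2", "r3", "r4"),
  A_12 = c(1, 2, 5, 7),
  B_12 = c(5, 4, 15, 8),
  C_12 = c(12, 7, 16, 0),
  D_12 = c(21, 10, 9, 7),
  E_12 = c(15, 9, 6, 18)
)
vectors <- list(
  c("A_12", "B_12", "C_12"),
  c("B_12", "C_12", "D_12", "E_12")
)

# Naming helper from above (assuming n_len + 1 was intended).
ex_fun <- function(vec, n_len) {
  paste0(paste(substr(vec, 1, n_len), collapse = ""),
         substr(vec[1], n_len + 1, nchar(vec[1])))
}

# One new column per vector, named from the vector's own elements.
for (v in vectors) {
  df[[ex_fun(v, 1)]] <- rowSums(df[, v, drop = FALSE])
}

# Optional: put the totals where the question's desired output shows them.
df <- df[, c("Name", "A_12", "B_12", "C_12", "ABC_12", "D_12", "E_12", "BCDE_12")]
print(df)
```

New columns are appended at the end by default, so the reordering line is only needed if the position of ABC_12 and BCDE_12 matters to you.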