I have a data frame and several vectors that contain variable names. Using these vectors, I want to create new columns, each holding the sum of the variables named in one vector. The vectors are generated by a for loop, so I know neither the number of vectors nor the variables present in each one, i.e. on every iteration I generate a Vec that contains different variable names.
For example, let's assume that my loop generates these three vectors: Vec when i=1, Vec when i=2, and Vec when i=3.
> Vec   # i = 1
"A","B","C"
> Vec   # i = 2
"B","D"
> Vec   # i = 3
"D","E"
Here's the data (data):
Name A B C D E
r1 1 5 12 21 15
r2 2 4 7 10 9
r3 5 15 6 9 6
r4 7 8 0 7 18
Here's the first result I should obtain (starting with the first vector):
Name A B C ABC D E
r1 1 5 12 18 21 15
r2 2 4 7 13 10 9
r3 5 15 6 26 9 6
r4 7 8 0 15 7 18
And here's the final result:
Name A B C ABC D BD E DE
r1 1 5 12 18 21 26 15 36
r2 2 4 7 13 10 14 9 19
r3 5 15 6 26 9 24 6 15
r4 7 8 0 15 7 15 18 25
I.e. the first Vec contains the variable names "A", "B", "C", and ABC contains the sum of variables A, B and C. The same goes for BD (sum of B and D) and DE (sum of D and E).
Note also that I want the name of each new column to be the concatenation of the column names present in the corresponding vector.
Please tell me if you need more information, explanation, or details.
CodePudding user response:
A solution with purrr::reduce:
library(tidyverse)
df <- data.frame(
stringsAsFactors = FALSE,
Name = c("r1", "r2", "r3", "r4"),
A = c(1L, 2L, 5L, 7L),
B = c(5L, 4L, 15L, 8L),
C = c(12L, 7L, 6L, 0L),
D = c(21L, 10L, 9L, 7L),
E = c(15L, 9L, 6L, 18L)
)
vts <- list(c("A","B","C"),c("B","D"),c("D","E"))
reduce(vts, function(x,y)
bind_cols(x, !!paste0(y,collapse = "") := rowSums(x[,y])), .init=df)
#> Name A B C D E ABC BD DE
#> 1 r1 1 5 12 21 15 18 26 36
#> 2 r2 2 4 7 10 9 13 14 19
#> 3 r3 5 15 6 9 6 26 24 15
#> 4 r4 7 8 0 7 18 15 15 25
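The expected output in the question places each new column right after the last variable it sums (ABC after C, BD after D, DE after E), whereas the result above appends the sums at the end. A minimal sketch of one way to get that ordering, assuming dplyr 1.0.0 or later for the `.after` argument of `mutate()`; the helper name `add_sum_after` is hypothetical and not part of the original answer:

```r
library(dplyr)
library(purrr)

df <- data.frame(
  Name = c("r1", "r2", "r3", "r4"),
  A = c(1L, 2L, 5L, 7L),
  B = c(5L, 4L, 15L, 8L),
  C = c(12L, 7L, 6L, 0L),
  D = c(21L, 10L, 9L, 7L),
  E = c(15L, 9L, 6L, 18L)
)
vts <- list(c("A", "B", "C"), c("B", "D"), c("D", "E"))

# Hypothetical helper: add the row sum of `vars` as a new column,
# placed directly after the last column named in `vars`.
add_sum_after <- function(data, vars) {
  new_name <- paste0(vars, collapse = "")
  data %>%
    mutate(
      "{new_name}" := rowSums(across(all_of(vars))),
      .after = all_of(vars[length(vars)])
    )
}

# Fold over the list of vectors, starting from the original data
result <- reduce(vts, add_sum_after, .init = df)
names(result)
#> [1] "Name" "A"    "B"    "C"    "ABC"  "D"    "BD"   "E"    "DE"
```

Because the helper only needs the current data and one vector, it could equally be called once per iteration inside the loop that generates the vectors.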
CodePudding user response:
For this kind of task it is (very) likely inefficient to use a loop to update a data.frame incrementally. I suggest using a list approach like the one provided by @PaulSmith.
However, to answer the question: if your Vec is the result of some manipulation of ABC, you can use the tidyverse for the wrangling and glue for the variable name syntax:
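library(tidyverse)
library(rlang)  # for parse_expr()
update_vec <- function(df, vec){
  # Name of the new column, e.g. "ABC"
  vec_name <- paste0(vec, collapse = "")
  # Build the expression A + B + C from the variable names
  x <- parse_expr(paste0(syms(vec), collapse = " + "))
  df %>%
    mutate("{vec_name}" := !!x)
}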
Now we can call it whenever we have a new vector, like so:
df <-
tibble::tribble(
~Name, ~A, ~B, ~C, ~D, ~E,
"r1", 1, 5, 12, 21, 15,
"r2", 2, 4, 7, 10, 9,
"r3", 5, 15, 6, 9, 6,
"r4", 7, 8, 0, 7, 18
)
Vec1 <- c("A", "B", "C")  # the first vector from the question
df %>%
  update_vec(Vec1)
# A tibble: 4 x 7
Name A B C D E ABC
<chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 r1 1 5 12 21 15 18
2 r2 2 4 7 10 9 13
3 r3 5 15 6 9 6 26
4 r4 7 8 0 7 18 15
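Since the question generates its vectors inside a for loop, the same kind of function can be applied on each iteration. The following is only a sketch: the list `generated` is a hypothetical stand-in for whatever the real loop produces, and the expression is built with `" + "` as the separator so that it parses as a sum.

```r
library(dplyr)
library(rlang)

# Same idea as the function above: sum the named columns into a new column
update_vec <- function(df, vec) {
  vec_name <- paste0(vec, collapse = "")
  x <- parse_expr(paste0(vec, collapse = " + "))
  df %>%
    mutate("{vec_name}" := !!x)
}

df <- data.frame(
  Name = c("r1", "r2", "r3", "r4"),
  A = c(1, 2, 5, 7),
  B = c(5, 4, 15, 8),
  C = c(12, 7, 6, 0),
  D = c(21, 10, 9, 7),
  E = c(15, 9, 6, 18)
)

# Hypothetical stand-in for the vectors produced by the real loop
generated <- list(c("A", "B", "C"), c("B", "D"), c("D", "E"))

for (i in seq_along(generated)) {
  Vec <- generated[[i]]      # in the real code, Vec is computed here
  df <- update_vec(df, Vec)  # add the matching sum column
}
df
```

Note that building expressions by pasting strings assumes the column names are syntactically valid R names.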
Lastly, some further reading on when and how to reduce duplication / copy-pasting in code.
CodePudding user response:
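library(dplyr)   # mutate(), across()
library(purrr)
library(stringr)
library(rlang)   # parse_expr()
v <- list(c("A","B","C"),c("B","D"),c("D","E"))
# Name each vector after its concatenated variables ("ABC", "BD", "DE"),
# then build one rowSums() expression per vector
express <- setNames(v, map(v, str_flatten)) %>%
  imap(~ parse_expr(sprintf("rowSums(across(c(%s)))", paste(.x, collapse = ","))))
# Splice the named expressions into mutate(), one new column per vector
df %>%
  mutate(!!! express)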
Output
Name A B C D E ABC BD DE
1 r1 1 5 12 21 15 18 26 36
2 r2 2 4 7 10 9 13 14 19
3 r3 5 15 6 9 6 26 24 15
4 r4 7 8 0 7 18 15 15 25
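For comparison, the same columns can be built without the tidyverse. This is a base R sketch rather than any of the approaches above, and the name `new_names` is introduced here purely for illustration:

```r
df <- data.frame(
  Name = c("r1", "r2", "r3", "r4"),
  A = c(1L, 2L, 5L, 7L),
  B = c(5L, 4L, 15L, 8L),
  C = c(12L, 7L, 6L, 0L),
  D = c(21L, 10L, 9L, 7L),
  E = c(15L, 9L, 6L, 18L)
)
v <- list(c("A", "B", "C"), c("B", "D"), c("D", "E"))

# Column names for the sums: "ABC", "BD", "DE"
new_names <- vapply(v, paste, character(1), collapse = "")

# One row sum per vector, assigned as new columns in a single step
df[new_names] <- lapply(v, function(cols) rowSums(df[cols]))
df
```

All sums are computed from the original columns before the assignment takes place, so the vectors can be processed in any order.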