R: Creating new columns from vectors of variable names generated in a for loop

Time:10-31

I have a data frame and several vectors that contain variable names. Using these vectors, I want to create new columns, each containing the row-wise sum of the variables named in one vector. The vectors are generated by a for loop, so I know neither how many vectors there will be nor which variables each one contains; on every iteration, Vec contains different variable names.

For example, assume my loop generates these three vectors: Vec when i=1, Vec when i=2, and Vec when i=3.

Vec (i=1) > "A","B","C"
Vec (i=2) > "B","D"
Vec (i=3) > "D","E"

Here's the data frame, data:

Name      A    B    C    D    E
r1        1    5    12  21    15
r2        2    4     7  10     9
r3        5   15     6   9     6
r4        7    8     0   7    18

Here's the first result I should obtain (starting with the first vector):

    Name      A    B    C     ABC      D      E          
     r1       1    5   12     18      21     15         
     r2       2    4    7     13      10      9        
     r3       5   15    6     26       9      6         
     r4       7    8    0     15       7     18         

And here's the final result:

Name      A    B    C     ABC        D       BD      E          DE
 r1       1    5   12     18         21      26      15         36 
 r2       2    4    7     13         10      14       9         19
 r3       5   15    6     26          9      24       6         15
 r4       7    8    0     15          7      15      18         25

That is, the first vector (V1) contains the variable names "A", "B", "C", and ABC contains the sum of variables A, B and C; the same goes for BD (sum of B and D) and DE (sum of D and E).

Note also that I want each new column to be named after the variables in its vector, concatenated (e.g. ABC).

Please tell me if you need more information, explanation, or details.

CodePudding user response:

A solution with purrr::reduce:

library(tidyverse)

df <- data.frame(
  stringsAsFactors = FALSE,
              Name = c("r1", "r2", "r3", "r4"),
                 A = c(1L, 2L, 5L, 7L),
                 B = c(5L, 4L, 15L, 8L),
                 C = c(12L, 7L, 6L, 0L),
                 D = c(21L, 10L, 9L, 7L),
                 E = c(15L, 9L, 6L, 18L)
      )
vts <- list(c("A","B","C"),c("B","D"),c("D","E"))

reduce(vts, function(x,y)
  bind_cols(x, !!paste0(y,collapse = "") := rowSums(x[,y])), .init=df) 

#>   Name A  B  C  D  E ABC BD DE
#> 1   r1 1  5 12 21 15  18 26 36
#> 2   r2 2  4  7 10  9  13 14 19
#> 3   r3 5 15  6  9  6  26 24 15
#> 4   r4 7  8  0  7 18  15 15 25
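
For comparison, the reduce() call can be spelled out as a plain for loop in base R, which is closer to how the question describes the vectors being produced. This is only a sketch: it assumes the vectors have been collected in a list (vts, as above), and the name out is introduced here for illustration and is not part of the original answer.

```r
# Same data and vectors as in the answer above
df <- data.frame(
  Name = c("r1", "r2", "r3", "r4"),
  A = c(1L, 2L, 5L, 7L),
  B = c(5L, 4L, 15L, 8L),
  C = c(12L, 7L, 6L, 0L),
  D = c(21L, 10L, 9L, 7L),
  E = c(15L, 9L, 6L, 18L),
  stringsAsFactors = FALSE
)
vts <- list(c("A", "B", "C"), c("B", "D"), c("D", "E"))

out <- df  # 'out' is a hypothetical name for the accumulated result
for (vec in vts) {
  # Column name is the concatenation of the variable names, e.g. "ABC"
  new_name <- paste0(vec, collapse = "")
  # drop = FALSE keeps a data frame even if vec has a single name
  out[[new_name]] <- rowSums(out[, vec, drop = FALSE])
}
out
```

Each iteration plays the role of one reduce() step: the data frame from the previous step comes in, and one summed column is appended.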

CodePudding user response:

For this kind of task it is (very) likely inefficient to use a loop to update a data.frame incrementally. The suggested alternative is a list approach like the one provided by @PaulSmith.

However, to answer the question for the case where your Vec is the result of some manipulation, here is a helper that uses the tidyverse for the wrangling and glue syntax for the variable name:

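library(tidyverse)
library(rlang)  # parse_expr() is not attached by library(tidyverse)

update_vec <- function(df, vec){
  # Name of the new column, e.g. "ABC" for c("A", "B", "C")
  vec_name <- paste0(vec, collapse = "")
  # Build the expression A + B + C from the variable names
  x <- parse_expr(paste0(syms(vec), collapse = " + "))
  df %>%
    mutate("{vec_name}" := !!x)
}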

Now we can call it whenever we have a new vector, like so:

# Example vector, as produced when i = 1 in the question
Vec1 <- c("A", "B", "C")

df <- 
tibble::tribble(
  ~Name, ~A, ~B, ~C, ~D, ~E,
  "r1", 1, 5, 12, 21, 15,
  "r2", 2, 4, 7, 10, 9,
  "r3", 5, 15, 6, 9, 6,
  "r4", 7, 8, 0, 7, 18
)
df %>%
  update_vec(Vec1)

# A tibble: 4 x 7
  Name      A     B     C     D     E   ABC
  <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 r1        1     5    12    21    15    18
2 r2        2     4     7    10     9    13
3 r3        5    15     6     9     6    26
4 r4        7     8     0     7    18    15
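
If parsing a string feels fragile, the same sum expression can probably be assembled directly from symbols. The sketch below is a variant, not the answer's own code: update_vec2 and vecs are hypothetical names, and it relies on rlang::call2() and rlang::syms() to build A + B + C without going through text.

```r
library(dplyr)
library(rlang)

# Hypothetical variant of update_vec(): builds the sum from symbols
update_vec2 <- function(df, vec) {
  vec_name <- paste0(vec, collapse = "")
  # Fold the symbols into nested calls: (A + B) + C
  sum_expr <- Reduce(function(lhs, rhs) call2("+", lhs, rhs), syms(vec))
  mutate(df, "{vec_name}" := !!sum_expr)
}

df <- data.frame(
  Name = c("r1", "r2", "r3", "r4"),
  A = c(1, 2, 5, 7),
  B = c(5, 4, 15, 8),
  C = c(12, 7, 6, 0),
  D = c(21, 10, 9, 7),
  E = c(15, 9, 6, 18)
)

# Assumed: the vectors from the loop, collected in a list
vecs <- list(c("A", "B", "C"), c("B", "D"), c("D", "E"))

out <- df
for (vec in vecs) {
  out <- update_vec2(out, vec)  # one new column per vector
}
out
```

Calling the helper once per vector inside the generating loop would work the same way; the list is used here only to keep the example self-contained.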

Lastly, some further reading on when and how to reduce duplication / copy-pasting in code.

CodePudding user response:

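library(dplyr)    # mutate(), across()
library(purrr)
library(stringr)
library(rlang)    # parse_expr()

v <- list(c("A","B","C"),c("B","D"),c("D","E"))

# One named expression per vector, e.g. ABC = rowSums(across(c(A,B,C)))
express <- setNames(v, map_chr(v, str_flatten)) %>% 
  map(~ parse_expr(sprintf("rowSums(across(c(%s)))", paste(.x, collapse = ","))))

df %>% 
  mutate(!!! express)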

Output

  Name A  B  C  D  E ABC BD DE
1   r1 1  5 12 21 15  18 26 36
2   r2 2  4  7 10  9  13 14 19
3   r3 5 15  6  9  6  26 24 15
4   r4 7  8  0  7 18  15 15 25
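
The outputs above append the new columns at the end, whereas the expected result in the question places each sum right after the last variable of its vector (ABC after C, BD after D, DE after E). A minimal sketch of one way to get that ordering, assuming dplyr 1.0.0 or later for the .after argument of mutate(); the names vts and out are chosen here for illustration.

```r
library(dplyr)

df <- data.frame(
  Name = c("r1", "r2", "r3", "r4"),
  A = c(1, 2, 5, 7),
  B = c(5, 4, 15, 8),
  C = c(12, 7, 6, 0),
  D = c(21, 10, 9, 7),
  E = c(15, 9, 6, 18)
)
vts <- list(c("A", "B", "C"), c("B", "D"), c("D", "E"))

out <- df
for (vec in vts) {
  new_name <- paste0(vec, collapse = "")
  out <- out %>%
    mutate(
      # Row-wise sum of the columns named in vec
      "{new_name}" := rowSums(across(all_of(vec))),
      # Place the new column right after the last variable in vec
      .after = all_of(vec[length(vec)])
    )
}
out
```

With the example data this should yield the column order Name, A, B, C, ABC, D, BD, E, DE, matching the final table in the question.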