R lapply looping into flexible vector and rename variable through suffix

Time:11-10

I have a flexible vector of column names. In real life it can vary a lot and depends on an external table, so I cannot slice by position, hard-code names in across, or rely on anything that depends on the variable names in my df themselves.

I would like to group my df and sum the variables whose names match the names in the possible_comb vector, then apply a "_sum" suffix to the output variable names, e.g. Jon.A_sum.

My df has several variables. Not all of them should be summed, only a selected, flexible list matching the possible_comb names.

In the code below, I'm missing how to rename the output variables with the _sum suffix in the lapply step, if that is possible, but I'm open to other looping approaches.


possible_comb <- c("Jon.A", "Bill.C", "Maria.E", "Ben.D")

Jon.A <- c(23, 41, 32, 58, 26)
Bill.C <- c(13, 41, 35, 18, 66)
v3 <- c(3,34, 33, 34, 23)
weight <- c(2, 2, 3,3, 6)

df <- data.frame(Jon.A,Bill.C,v3,weight)

library(data.table)
setDT(df)

df_grouped<- df[, lapply(.SD, sum), by=c("weight") , .SDcols=possible_comb] 

#wanted results

Jon.A_sum <- c(64, 90, 26)
Bill.C_sum <- c(54,53, 66)
weight <- c(2,3, 6)

wanted <- data.frame(Jon.A_sum,Bill.C_sum,weight)

CodePudding user response:

A data.table solution:

library(data.table)

possible_comb <- c("Jon.A", "Bill.C")
new_cols <- paste0(possible_comb, '_sum')

df_grouped<- df[, setNames(lapply(.SD, sum), new_cols), 
                   by=c("weight") , .SDcols=possible_comb] 

df_grouped

#   weight Jon.A_sum Bill.C_sum
#1:      2        64         54
#2:      3        90         53
#3:      6        26         66
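The question mentions that possible_comb can contain names that are not columns of df (Maria.E and Ben.D in the original vector). A minimal sketch of the same setNames approach under that assumption follows; the helper variable present is introduced here for illustration and is not part of the original answer. It uses intersect to keep only the names that exist in df before passing them to .SDcols.

```r
library(data.table)

# Flexible vector, as in the question: some names are not columns of df
possible_comb <- c("Jon.A", "Bill.C", "Maria.E", "Ben.D")

df <- data.table(
  Jon.A  = c(23, 41, 32, 58, 26),
  Bill.C = c(13, 41, 35, 18, 66),
  v3     = c(3, 34, 33, 34, 23),
  weight = c(2, 2, 3, 3, 6)
)

# 'present' is a hypothetical helper name: the requested columns that exist in df
present <- intersect(possible_comb, names(df))

# Sum each selected column per weight group and add the _sum suffix in one step
df_grouped <- df[, setNames(lapply(.SD, sum), paste0(present, "_sum")),
                 by = weight, .SDcols = present]

df_grouped
```

Names missing from df are dropped silently here; if a missing name should instead be treated as an error, compare present with possible_comb before grouping.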

In dplyr you can use across with group_by and assign new names with .names.

library(dplyr)

df %>%
  group_by(weight) %>%
  summarise(across(all_of(possible_comb), sum, .names = '{col}_sum'))

#  weight Jon.A_sum Bill.C_sum
#   <dbl>     <dbl>      <dbl>
#1      2        64         54
#2      3        90         53
#3      6        26         66
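If possible_comb may list names that are absent from df, as in the question's original four-element vector, all_of will stop with an error. A hedged variant of the dplyr call, assuming missing names should simply be skipped, swaps in any_of; the object name result is chosen here for illustration.

```r
library(dplyr)

# Flexible vector, as in the question: some names are not columns of df
possible_comb <- c("Jon.A", "Bill.C", "Maria.E", "Ben.D")

df <- data.frame(
  Jon.A  = c(23, 41, 32, 58, 26),
  Bill.C = c(13, 41, 35, 18, 66),
  v3     = c(3, 34, 33, 34, 23),
  weight = c(2, 2, 3, 3, 6)
)

# any_of() selects only the names that exist in df and ignores the rest
result <- df %>%
  group_by(weight) %>%
  summarise(across(any_of(possible_comb), sum, .names = "{.col}_sum"))

result
```

Keeping all_of is the stricter choice when a missing column should be reported rather than ignored.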

CodePudding user response:

If I understand your desired output correctly, you can do something like this:

# Keep only the requested names that actually exist as columns in df
cols_to_use <- intersect(possible_comb, names(df))
df_grouped<- df[, lapply(.SD, sum), by=c("weight") , .SDcols = cols_to_use]
setcolorder(df_grouped, cols_to_use)
setnames(df_grouped, old = cols_to_use, new = paste(cols_to_use, "sum", sep = "_"))