I want to make a new column that is the sum of all the columns that start with "m_" and a new column that is the sum of all the columns that start with "w_". Unfortunately it is not every nth column, so indexing all the odd and even columns won't work.
columnnames <- c("m_16", "w_16", "w_17", "m_17", "w_18", "m_18")
values1 <- c(3, 4, 8, 1, 12, 4)
values2 <- c(8, 0, 12, 1, 3, 2)
df <- as.data.frame(rbind(values1, values2))
names(df) <- columnnames
What I want to get is:
columnnames <- c("m_16", "w_16", "w_17", "m_17", "w_18", "m_18", "sum_m", "sum_w")
values1 <- c(3, 4, 8, 1, 12, 4, 8, 24)
values2 <- c(8, 0, 12, 1, 3, 2, 11, 15)
df <- as.data.frame(rbind(values1, values2))
names(df) <- columnnames
So far during my search, I only found how to sum specific columns on conditions but I don't want to specify the columns because there's a lot of them.
Thank you for your help in advance!
CodePudding user response:
Another possible solution:
library(dplyr)
df %>%
mutate(sum_m = across(starts_with("m")) %>% rowSums) %>%
mutate(sum_w = across(starts_with("w")) %>% rowSums)
#> m_16 w_16 w_17 m_17 w_18 m_18 sum_m sum_w
#> values1 3 4 8 1 12 4 8 24
#> values2 8 0 12 1 3 2 11 15
CodePudding user response:
dplyr
has a quick answer:
library(dplyr)
df <- df %>%
mutate(
m_col_sum = select(., starts_with("m")) %>% rowSums(),
w_col_sum = select(., starts_with("w")) %>% rowSums()
)
You might need to specify na.rm = TRUE
as an additional argument to rowSums()
.
CodePudding user response:
# Vector containing the letters which target vectors'
# names start with: names_start_with => character vector
names_start_with <- c("m", "w")
# Compute row-sums, column-bind vectors to data.frame: res => data.frame
res <- cbind(
df,
vapply(
names_start_with,
function(x){
rowSums(df[,startsWith(names(df), x)])
},
numeric(length(names_start_with))
),
row.names = NULL
)
# Output data.frame to console: data.frame => stdout(console)
res
CodePudding user response:
A base solution using rowSums
inside lapply
.
cbind(df, lapply(c(sum_m = "m", sum_w = "w"),
\(x) rowSums(df[startsWith(names(df), x)])))
# m_16 w_16 w_17 m_17 w_18 m_18 sum_m sum_w
#values1 3 4 8 1 12 4 8 24
#values2 8 0 12 1 3 2 11 15
Or in case there are not so many groups simply:
df$sum_m <- rowSums(df[startsWith(names(df), "m")])
df$sum_w <- rowSums(df[startsWith(names(df), "w")])
df
# m_16 w_16 w_17 m_17 w_18 m_18 sum_m sum_w
#values1 3 4 8 1 12 4 8 24
#values2 8 0 12 1 3 2 11 15