I would like to know how to make new columns that are row sums of existing columns using a loop. Given this data,
df <- data.frame(A=c(22, 25, 29, 13, 22, 30),
B=c(12, 10, 6, 6, 8, 11),
C=c(NA, 15, 15, 18, 22, 13))
I would like to make two columns called a1 and a2, where a1 is the row sums of columns A and B, and a2 is the row sums of columns A, B, and C.
The desired output would look as follows.
---- ---- ---- ---- ----
| A | B | C | a1 | a2 |
---- ---- ---- ---- ----
| 22 | 12 | NA | 34 | 34 |
| 25 | 10 | 15 | 35 | 50 |
| 29 | 6 | 15 | 35 | 50 |
| 13 | 6 | 18 | 19 | 37 |
| 22 | 8 | 22 | 30 | 52 |
| 30 | 11 | 13 | 41 | 54 |
---- ---- ---- ---- ----
I tried the following two methods, but both give me errors.
First, I tried using dplyr
for(i in 1:2) {
df<-df%>%
mutate_(paste0("a",i)= rowSums(df[,1:(1+i)],na.rm=TRUE))
}
Second, I tried using data.table
for(i in 1:2) {
df<-df[,paste0("a",i) := rowSums(df[,1:(1+i)])]
}
I would like to know how to get the desired output both ways. Also, I think a loop may not be the best method, so I would also like to know how to do this with the "apply" functions, if possible.
Thank you so much in advance!
CodePudding user response:
Here you go
for(i in 1:2) {
df[[paste0("a",i)]] <- rowSums(df[, 1:(i + 1)], na.rm = TRUE)
}
df
A B C a1 a2
1 22 12 NA 34 34
2 25 10 15 35 50
3 29 6 15 35 50
4 13 6 18 19 37
5 22 8 22 30 52
6 30 11 13 41 54
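If you still want the dplyr and data.table versions from the question, here is a minimal sketch, assuming both packages are installed. One way to build a column name from a string in dplyr is to unquote it with !! on the left of :=, and in data.table wrapping the name in parentheses makes := evaluate it instead of using it literally. The names df_dplyr and dt are mine, used only to keep the two results apart.

```r
library(dplyr)
library(data.table)

df <- data.frame(A = c(22, 25, 29, 13, 22, 30),
                 B = c(12, 10, 6, 6, 8, 11),
                 C = c(NA, 15, 15, 18, 22, 13))

# dplyr: unquote the computed name with !! and assign with :=
df_dplyr <- df                      # hypothetical name for the dplyr result
for (i in 1:2) {
  df_dplyr <- df_dplyr %>%
    mutate(!!paste0("a", i) := rowSums(across(all_of(1:(i + 1))), na.rm = TRUE))
}

# data.table: (name) := value adds the column by reference, so no reassignment
dt <- as.data.table(df)             # hypothetical name for the data.table result
for (i in 1:2) {
  dt[, (paste0("a", i)) := rowSums(.SD, na.rm = TRUE), .SDcols = 1:(i + 1)]
}

df_dplyr
dt
```

Both loops select columns by position, so the newly added a1 (column 4) is not picked up when a2 is computed from columns 1 to 3.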
CodePudding user response:
To answer your question about using the apply(df, MARGIN, FUN, ...) function, all you have to remember is that a MARGIN of 1 is for row-wise operations and a MARGIN of 2 is for column-wise operations.
Also, you can pass any additional arguments for FUN within the apply call!
So, in your case, apply(df, 1, sum, na.rm = TRUE) calculates the sum of every row across all columns of df, ignoring any NA values.
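To make the two MARGIN values concrete, here is a small sketch on the data from the question. The data frame is rebuilt so the snippet runs on its own, and row_sums and col_sums are just illustrative names.

```r
df <- data.frame(A = c(22, 25, 29, 13, 22, 30),
                 B = c(12, 10, 6, 6, 8, 11),
                 C = c(NA, 15, 15, 18, 22, 13))

# MARGIN = 1: apply sum to each row (one value per row)
row_sums <- apply(df, 1, sum, na.rm = TRUE)
row_sums   # values: 34 50 50 37 52 54

# MARGIN = 2: apply sum to each column (one value per column)
col_sums <- apply(df, 2, sum, na.rm = TRUE)
col_sums   # values: A = 141, B = 53, C = 83
```

The na.rm = TRUE argument is not used by apply itself; it is passed on to sum for every row or column.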
Instead of using dplyr or data.table, you could do this with:
df["a1"] <- apply(df[1:2], 1, sum, na.rm = TRUE)  # row sums of A and B
df["a2"] <- apply(df[1:3], 1, sum, na.rm = TRUE)  # row sums of A, B, and C
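If you end up with more than a couple of these columns, the two lines above can probably be generalized without an explicit loop. The following is only a sketch: n_new is a name I made up for the number of new columns, and it assumes the columns to sum are always the first i + 1 columns of df.

```r
df <- data.frame(A = c(22, 25, 29, 13, 22, 30),
                 B = c(12, 10, 6, 6, 8, 11),
                 C = c(NA, 15, 15, 18, 22, 13))

n_new <- 2  # hypothetical: how many a1, a2, ... columns to create

# lapply returns a list with one vector of row sums per new column;
# the right-hand side is evaluated before any column is added to df
df[paste0("a", seq_len(n_new))] <- lapply(seq_len(n_new), function(i) {
  apply(df[1:(i + 1)], 1, sum, na.rm = TRUE)
})

df
```

Because every element of the list is computed from the original columns before the assignment happens, a1 is never included by accident when a2 is calculated.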