Hi, I have a table (data1) and a numeric vector (quantile), and I'm trying to append calculated columns built from the existing columns of data1 and the values in quantile,
such that:
newcol_20% = col1 + col2 + 20%,
newcol_50% = col2 + col3 + 50%,
newcol_70% = col3 + col4 + 70%
data1, quantile, and the desired output (out) are shown below:
>data1
ID col1 col2 col3 col4
ABC124 10 15 6 15
ABC445 8 8 25 34
ABC550 10 15 5 12
---
ZZZ980 12 21 26 11
ZZZ999 22 19 11 8
> quantile
20% 50% 70%
10 21 35
> out
ID col1 col2 col3 col4 newcol_20% newcol_50% newcol_70%
ABC124 10 15 6 15 35 42 56
ABC445 8 8 25 34 26 54 94
ABC550 10 15 5 12 35 41 52
---
ZZZ980 12 21 26 11 43 68 72
ZZZ999 22 19 11 8 51 51 54
How could I do the above using base R? Any help or suggestions would be appreciated, thanks!
CodePudding user response:
A really simple and efficient way to do this is using the mutate() function from the dplyr package:
library(dplyr)
new_df <- df %>% mutate(newcol_20 = col1 + col2 + 10,
                        newcol_50 = col2 + col3 + 21,
                        newcol_70 = col3 + col4 + 35
)
However, if it needs to be base R, you can just assign a new column using $:
df$newcol_20 <- df$col1 + df$col2 + 10
And similarly for the other two columns.
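Spelled out for all three columns, the base R version might look like the sketch below. The data frame is rebuilt from the rows visible in the question (the elided rows are left out), and the constants 10, 21 and 35 are the values of the quantile vector.

```r
# Visible rows from the question; the name df follows this answer
df <- data.frame(
  ID   = c("ABC124", "ABC445", "ABC550", "ZZZ980", "ZZZ999"),
  col1 = c(10, 8, 10, 12, 22),
  col2 = c(15, 8, 15, 21, 19),
  col3 = c(6, 25, 5, 26, 11),
  col4 = c(15, 34, 12, 11, 8)
)

# Each new column is the sum of two neighbouring columns plus one quantile value
df$newcol_20 <- df$col1 + df$col2 + 10
df$newcol_50 <- df$col2 + df$col3 + 21
df$newcol_70 <- df$col3 + df$col4 + 35

df
```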
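PS. '%' is not allowed in a syntactic R name, so a column called newcol_20% can only be referenced with backticks or [[ ]]. Plain names such as newcol_20 are easier to work with.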
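To illustrate the naming point, here is a small sketch on a toy data frame (d and x are made-up names, not from the question). It shows what make.names() does with a '%' and how a column with such a name can still be created and read.

```r
# make.names() replaces characters that are invalid in a syntactic name with "."
make.names("newcol_20%")

# Toy data frame, used only for illustration
d <- data.frame(x = 1:2)

# [[ ]] assignment keeps the name exactly as given, including the '%'
d[["newcol_20%"]] <- d$x + 10

# With $, a non-syntactic name has to be wrapped in backticks
d$`newcol_20%`
names(d)
```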
CodePudding user response:
Here’s a base R solution that will generalize to any number of columns and vector elements:
src_cols <- data1[-1]
qnt_names <- names(quantile)
for (i in seq_along(src_cols)) {
  # Stop before the last column, which has no right-hand neighbour
  if (i < ncol(src_cols)) {
    data1[[paste0("newcol_", qnt_names[[i]])]] <- src_cols[[i]] + src_cols[[i + 1]] + quantile[[i]]
  }
}
Result:
ID col1 col2 col3 col4 newcol_20% newcol_50% newcol_70%
1 ABC124 10 15 6 15 35 42 56
2 ABC445 8 8 25 34 26 54 94
3 ABC550 10 15 5 12 35 41 52
4 ZZZ980 12 21 26 11 43 68 72
5 ZZZ999 22 19 11 8 51 51 54
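As an alternative to the loop, the same pairing of neighbouring columns can probably be written with Map(). This is only a sketch, not part of the answer above: it assumes quantile has exactly one element fewer than the number of numeric columns, and the helper names left, right and new_cols are made up for the example. The data are the rows visible in the question.

```r
# Visible rows from the question (the elided middle rows are omitted)
data1 <- data.frame(
  ID   = c("ABC124", "ABC445", "ABC550", "ZZZ980", "ZZZ999"),
  col1 = c(10, 8, 10, 12, 22),
  col2 = c(15, 8, 15, 21, 19),
  col3 = c(6, 25, 5, 26, 11),
  col4 = c(15, 34, 12, 11, 8)
)
quantile <- c("20%" = 10, "50%" = 21, "70%" = 35)

src_cols <- data1[-1]
left  <- src_cols[-ncol(src_cols)]  # col1, col2, col3
right <- src_cols[-1]               # col2, col3, col4

# Walk the three inputs in parallel: left column + right column + quantile value
new_cols <- Map(function(a, b, q) a + b + q, left, right, quantile)
names(new_cols) <- paste0("newcol_", names(quantile))

# Append all new columns in one assignment
data1[names(new_cols)] <- new_cols
data1
```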
CodePudding user response:
In base R you may use transform.
qs <- c(10, 21, 35)
dat <- transform(dat,
                 newcol_20=col1 + col2 + qs[1],
                 newcol_50=col2 + col3 + qs[2],
                 newcol_70=col3 + col4 + qs[3])
dat
# ID col1 col2 col3 col4 newcol_20 newcol_50 newcol_70
# 1 ABC124 10 15 6 15 35 42 56
# 2 ABC445 8 8 25 34 26 54 94
# 3 ABC550 10 15 5 12 35 41 52
# 4 ZZZ980 12 21 26 11 43 68 72
# 5 ZZZ999 22 19 11 8 51 51 54
PS: Avoid special characters in names; see ?make.names for the rules on valid names.
Data:
dat <- structure(list(ID = c("ABC124", "ABC445", "ABC550", "ZZZ980",
"ZZZ999"), col1 = c(10L, 8L, 10L, 12L, 22L), col2 = c(15L, 8L,
15L, 21L, 19L), col3 = c(6L, 25L, 5L, 26L, 11L), col4 = c(15L,
34L, 12L, 11L, 8L)), class = "data.frame", row.names = c(NA,
-5L))