I have a function that takes a data frame and a character vector of column names as arguments. It returns the data frame with a new column in which the elements of those columns are concatenated, as follows:
Dataframe:
df <- data.frame(x = c("A","B","C","D","E"), y = c("1","2","3","4","5"),
z = c("Test1","Test2", "Test3","Test4","Test5"),
w =c("B1","B2","B3","B4","B5"))
If the user defines the vector as vec <- c("x","y"), the output should be:
newcol <- function(df, vec){
df <- df %>% mutate(newcolumn = paste(get("x"),get("y"), sep = ","))
return (df)
}
newcol(df, vec)
x y z w newcolumn
1 A 1 Test1 B1 A,1
2 B 2 Test2 B2 B,2
3 C 3 Test3 B3 C,3
4 D 4 Test4 B4 D,4
5 E 5 Test5 B5 E,5
And if vec <- c("x","y","z"), the output should be:
newcol <- function(df, vec){
df <- df %>% mutate(newcolumn = paste(get("x"),get("y"), get("z"), sep = ","))
return (df)
}
newcol(df, vec)
x y z w newcolumn
1 A 1 Test1 B1 A,1,Test1
2 B 2 Test2 B2 B,2,Test2
3 C 3 Test3 B3 C,3,Test3
4 D 4 Test4 B4 D,4,Test4
5 E 5 Test5 B5 E,5,Test5
I wonder how this concatenation can be done dynamically.
CodePudding user response:
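Use paste with !!! to splice the columns selected by vec into the call as separate arguments. Inside the %>% pipe, . refers to the data frame being piped in.
library(dplyr)
newcol <- function(df, vec){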
df %>% mutate(newcolumn = paste(!!!.[vec], sep = ","))
}
newcol(df, c("x", "y", "z"))
## x y z w newcolumn
## 1 A 1 Test1 B1 A,1,Test1
## 2 B 2 Test2 B2 B,2,Test2
## 3 C 3 Test3 B3 C,3,Test3
## 4 D 4 Test4 B4 D,4,Test4
## 5 E 5 Test5 B5 E,5,Test5
This also works and has no package dependencies.
newcol <- function(df, vec){
cbind(df, newcolumn = apply(df[vec], 1, paste, collapse = ","))
}
If comma followed by space is ok then this works:
newcol <- function(df, vec){
cbind(df, newcolumn = apply(df[vec], 1, toString))
}
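Another dependency-free option, sketched here as an alternative to the `apply` versions above, is to hand the selected columns to `paste` through `do.call`. This keeps the work vectorised over columns instead of looping over rows. The function name `newcol_base` and the `sep` argument are hypothetical, chosen only for this illustration.

```r
df <- data.frame(x = c("A","B","C","D","E"), y = c("1","2","3","4","5"),
                 z = c("Test1","Test2","Test3","Test4","Test5"),
                 w = c("B1","B2","B3","B4","B5"))

# Hypothetical helper: paste the columns named in `vec` row by row.
newcol_base <- function(df, vec, sep = ",") {
  # unname() drops the column names so that a column called "sep" or
  # "collapse" cannot be mistaken for an argument of paste().
  cols <- unname(as.list(df[vec]))
  df$newcolumn <- do.call(paste, c(cols, sep = sep))
  df
}

newcol_base(df, c("x", "y", "z"))
```

Unlike `cbind`, this assigns the new column directly, so the result is still the original data frame with one extra column at the end.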
CodePudding user response:
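Using unite from tidyr:
library(dplyr)
library(tidyr)
newcol <- function(df, vec){
# all_of() states explicitly that vec holds column names;
# passing a bare character vector to a selection is deprecated
df %>% unite("newcol", all_of(vec), sep = ",", remove = FALSE)
}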
vec <- c("x","z")
newcol(df,vec)
output:
newcol x y z w
1 A,Test1 A 1 Test1 B1
2 B,Test2 B 2 Test2 B2
3 C,Test3 C 3 Test3 B3
4 D,Test4 D 4 Test4 B4
5 E,Test5 E 5 Test5 B5
CodePudding user response:
If you want to be really clever about it, you can use rlang and tidy evaluation to pass the columns as bare names rather than strings:
df <- data.frame(x = c("A","B","C","D","E"), y = c("1","2","3","4","5"),
z = c("Test1","Test2", "Test3","Test4","Test5"),
w =c("B1","B2","B3","B4","B5"))
library(rlang)
library(dplyr)
newcol <- function(df, ...) {
vec <- enquos(...)
df <- df %>% mutate(newcolumn = paste(!!!vec, sep = ","))
return(df)
}
df |>
newcol(x, y)
#> x y z w newcolumn
#> 1 A 1 Test1 B1 A,1
#> 2 B 2 Test2 B2 B,2
#> 3 C 3 Test3 B3 C,3
#> 4 D 4 Test4 B4 D,4
#> 5 E 5 Test5 B5 E,5
df |>
newcol(x, y, z)
#> x y z w newcolumn
#> 1 A 1 Test1 B1 A,1,Test1
#> 2 B 2 Test2 B2 B,2,Test2
#> 3 C 3 Test3 B3 C,3,Test3
#> 4 D 4 Test4 B4 D,4,Test4
#> 5 E 5 Test5 B5 E,5,Test5
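If the column names still arrive as a character vector, as in the question, the same splicing idea can be kept by converting the strings to symbols first. The following is a minimal sketch assuming dplyr and rlang are installed; `newcol_syms` and its `sep` argument are hypothetical names used only for this example.

```r
library(dplyr)
library(rlang)

df <- data.frame(x = c("A","B","C","D","E"), y = c("1","2","3","4","5"),
                 z = c("Test1","Test2","Test3","Test4","Test5"),
                 w = c("B1","B2","B3","B4","B5"))

# Hypothetical helper: accepts column names as strings.
newcol_syms <- function(df, vec, sep = ",") {
  # syms() turns c("x", "y") into a list of symbols x and y;
  # !!! then splices them into paste() as separate arguments.
  cols <- syms(vec)
  df %>% mutate(newcolumn = paste(!!!cols, sep = sep))
}

newcol_syms(df, c("x", "y"))
```

With `vec <- c("x", "y")`, the expression inside `mutate` is effectively `paste(x, y, sep = sep)`, evaluated against the columns of `df`.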