Combining R data frame columns dynamically

Time:10-06

I have a function that takes a data frame and a character vector of column names as arguments. It should return the data frame with a new column in which the values of the named columns are concatenated row by row, as follows:

Dataframe:

df <- data.frame(x = c("A", "B", "C", "D", "E"),
                 y = c("1", "2", "3", "4", "5"),
                 z = c("Test1", "Test2", "Test3", "Test4", "Test5"),
                 w = c("B1", "B2", "B3", "B4", "B5"))

If the user defines the vector as vec <- c("x","y"), the output should match what this hard-coded version produces:

library(dplyr)

newcol <- function(df, vec){
  # Hard-coded for columns x and y; vec is not used yet
  df <- df %>% mutate(newcolumn = paste(get("x"),get("y"), sep = ","))
  return (df)
}

newcol(df, vec)

  x y     z  w newcolumn
1 A 1 Test1 B1       A,1
2 B 2 Test2 B2       B,2
3 C 3 Test3 B3       C,3
4 D 4 Test4 B4       D,4
5 E 5 Test5 B5       E,5

And if vec <- c("x","y", "z"), the output should be as follows:

newcol <- function(df, vec){
  df <- df %>% mutate(newcolumn = paste(get("x"),get("y"), get("z"), sep = ","))
  return (df)
}

newcol(df, vec)

  x y     z  w newcolumn
1 A 1 Test1 B1 A,1,Test1
2 B 2 Test2 B2 B,2,Test2
3 C 3 Test3 B3 C,3,Test3
4 D 4 Test4 B4 D,4,Test4
5 E 5 Test5 B5 E,5,Test5

How can this concatenation be done dynamically, so that the function uses whichever columns are named in vec?

CodePudding user response:

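Use paste with the !!! splice operator, which expands the columns selected by vec into separate arguments to paste. Note that . here is the magrittr placeholder for df, so this relies on %>% and would not work as written with the native |> pipe.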

newcol <- function(df, vec){
  df %>% mutate(newcolumn = paste(!!!.[vec], sep = ","))
}

newcol(df, c("x", "y", "z"))
##   x y     z  w newcolumn
## 1 A 1 Test1 B1 A,1,Test1
## 2 B 2 Test2 B2 B,2,Test2
## 3 C 3 Test3 B3 C,3,Test3
## 4 D 4 Test4 B4 D,4,Test4
## 5 E 5 Test5 B5 E,5,Test5
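
A variant of the same idea that does not depend on the . placeholder is to convert the column names to symbols before splicing. This is only a sketch: the function name newcol_syms is made up for illustration, and it assumes dplyr and rlang are installed.

```r
library(dplyr)
library(rlang)

# Example data from the question.
df <- data.frame(x = c("A", "B", "C", "D", "E"),
                 y = c("1", "2", "3", "4", "5"),
                 z = c("Test1", "Test2", "Test3", "Test4", "Test5"),
                 w = c("B1", "B2", "B3", "B4", "B5"))

# newcol_syms is a hypothetical name, not part of the original answer.
newcol_syms <- function(df, vec) {
  # syms() turns c("x", "y") into a list of symbols x and y;
  # !!! splices them so mutate() evaluates paste(x, y, sep = ",").
  df %>% mutate(newcolumn = paste(!!!syms(vec), sep = ","))
}

out <- newcol_syms(df, c("x", "y"))
print(out)
```

Because the symbols are looked up inside the data frame by mutate, this form should behave the same whether df is piped in with %>% or |>.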

This also works and has no package dependencies.

newcol <- function(df, vec){
  cbind(df, newcolumn = apply(df[vec], 1, paste, collapse = ","))
}

If comma followed by space is ok then this works:

newcol <- function(df, vec){
  cbind(df, newcolumn = apply(df[vec], 1, toString))
}
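
One thing to keep in mind with apply is that it first converts the data frame to a matrix, so with mixed column types numeric values can end up formatted to a common width (padded with spaces). A common base R alternative is to call paste on the columns directly through do.call. The sketch below assumes the same df as in the question; the name newcol_base and the sep argument are illustrative additions.

```r
# Example data from the question.
df <- data.frame(x = c("A", "B", "C", "D", "E"),
                 y = c("1", "2", "3", "4", "5"),
                 z = c("Test1", "Test2", "Test3", "Test4", "Test5"),
                 w = c("B1", "B2", "B3", "B4", "B5"))

# newcol_base is a hypothetical name, not part of the original answer.
newcol_base <- function(df, vec, sep = ",") {
  # df[vec] is a list of columns. Dropping the names and appending sep
  # makes the call equivalent to paste(df$x, df$y, ..., sep = sep),
  # which pastes element-wise without going through a matrix.
  cols <- unname(as.list(df[vec]))
  df$newcolumn <- do.call(paste, c(cols, sep = sep))
  df
}

res <- newcol_base(df, c("x", "y", "z"))
print(res)
```

Passing sep = ", " reproduces the comma-and-space variant.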

CodePudding user response:

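Using unite from tidyr:

library(dplyr)
library(tidyr)

newcol <- function(df, vec){
  # all_of() tells tidyselect that vec holds column names as strings
  df %>% unite("newcol", all_of(vec), sep = ",", remove = FALSE)
}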

vec <- c("x","z")
newcol(df,vec)

Output:

   newcol x y     z  w
1 A,Test1 A 1 Test1 B1
2 B,Test2 B 2 Test2 B2
3 C,Test3 C 3 Test3 B3
4 D,Test4 D 4 Test4 B4
5 E,Test5 E 5 Test5 B5

CodePudding user response:

If you want to be really clever about it, you can use rlang's tidy evaluation to pass the columns as bare names rather than strings. Here enquos captures the names given in ... and !!! splices them into paste:

df <- data.frame(x = c("A", "B", "C", "D", "E"),
                 y = c("1", "2", "3", "4", "5"),
                 z = c("Test1", "Test2", "Test3", "Test4", "Test5"),
                 w = c("B1", "B2", "B3", "B4", "B5"))

library(rlang)
library(dplyr)

newcol <- function(df, ...) {
  vec <- enquos(...)
  df <- df %>% mutate(newcolumn = paste(!!!vec, sep = ","))
  return(df)
}

df |> 
  newcol(x, y)

#>   x y     z  w newcolumn
#> 1 A 1 Test1 B1       A,1
#> 2 B 2 Test2 B2       B,2
#> 3 C 3 Test3 B3       C,3
#> 4 D 4 Test4 B4       D,4
#> 5 E 5 Test5 B5       E,5

df |> 
  newcol(x, y, z)

#>   x y     z  w newcolumn
#> 1 A 1 Test1 B1 A,1,Test1
#> 2 B 2 Test2 B2 B,2,Test2
#> 3 C 3 Test3 B3 C,3,Test3
#> 4 D 4 Test4 B4 D,4,Test4
#> 5 E 5 Test5 B5 E,5,Test5