I have this data.table:
library(data.table)
df <- data.table(id=c(1,2,3,4,5,6,7,8,9,10),
var1=c(0,4,5,6,99,3,5,5,23,0),
var2=c(22,4,6,25,6,70,75,23,24,21))
I would like to add a third column, var3, so that the result looks like this:
df <- data.table(id=c(1,2,3,4,5,6,7,8,9,10),
var1=c(0,4,5,6,99,3,5,5,23,0),
var2=c(22,4,6,25,6,70,75,23,24,21),
var3=c("0_22","4_4","5_6","6_25","99_6","3_70","5_75","5_23","23_24","0_21"))
Each value of var3 should be the value of var1, an underscore, and the value of var2. var1 and var2 are categorical variables (they represent medications), so var3 would represent a combination of medications.
How can I do this?
Thanks!
CodePudding user response:
Load packages
library(data.table)
library(dplyr)
Create the data.table
df <- data.table(
  id = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10),
  var1 = c(0, 4, 5, 6, 99, 3, 5, 5, 23, 0),
  var2 = c(22, 4, 6, 25, 6, 70, 75, 23, 24, 21)
)
Add the new variable
Using dplyr and sprintf (%d works here only because every value is a whole number; a non-integer value would raise an error)
df <- df %>%
  mutate(var3 = sprintf("%d_%d", var1, var2))
Using dplyr and paste0
df <- df %>%
  mutate(var3 = paste0(var1, "_", var2))
Using base R and sprintf
df$var3 <- sprintf("%d_%d", df$var1, df$var2)
Using base R and paste0
df$var3 <- paste0(df$var1, "_", df$var2)
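Since df in the question is a data.table, the same column can also be added with data.table's own := operator. The following is a minimal sketch, not part of the original answer; the extra column name var3_f is made up for illustration, and storing the result as a factor is only an assumption based on the question describing the values as categorical.

```r
library(data.table)

df <- data.table(
  id   = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10),
  var1 = c(0, 4, 5, 6, 99, 3, 5, 5, 23, 0),
  var2 = c(22, 4, 6, 25, 6, 70, 75, 23, 24, 21)
)

# := adds the column by reference, so df does not need to be reassigned
df[, var3 := paste(var1, var2, sep = "_")]

# Optional (assumption): keep the combination as a factor, since it is categorical
# var3_f is a hypothetical column name used only in this sketch
df[, var3_f := factor(var3)]
```

Because := modifies df in place, this avoids copying the table, which may matter for large data.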
CodePudding user response:
As @Wimpel says, the solution is df$var3 <- paste(df$var1, df$var2, sep = "_"). Thanks!
CodePudding user response:
You can also do this with the unite() function from tidyr, which is part of the tidyverse:
library(tidyverse)
df <- tibble(
  id = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10),
  var1 = c(0, 4, 5, 6, 99, 3, 5, 5, 23, 0),
  var2 = c(22, 4, 6, 25, 6, 70, 75, 23, 24, 21)
) %>%
  # create the new variable; remove = FALSE keeps var1 and var2
  unite(var3, c(var1, var2), sep = "_", remove = FALSE)
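One detail worth knowing: unite() places the new column at the position of the first column being united, so with the call above var3 ends up before var1 rather than at the end. If the column order from the question matters, one possible follow-up is dplyr's relocate(). This is a small sketch on a shortened copy of the data (first three rows only, chosen for brevity), not part of the original answer.

```r
library(dplyr)
library(tidyr)

# Shortened example data: the first three rows from the question
df <- tibble(
  id   = c(1, 2, 3),
  var1 = c(0, 4, 5),
  var2 = c(22, 4, 6)
)

out <- df %>%
  # var3 is created where var1 sits, i.e. right after id
  unite(var3, c(var1, var2), sep = "_", remove = FALSE) %>%
  # move var3 behind var2 to match the layout asked for in the question
  relocate(var3, .after = var2)
```

Whether the extra step is needed depends only on how much the column order matters downstream; the values in var3 are the same either way.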