I have two separate datasets that I want to merge. Here is the first one (drug users):
> dput(df)
structure(list(ID_druguser = c("123", "234", "324", "345"), Test_Result = c("POSITIVE",
"NEGATIVE", "NEGATIVE", "NEGATIVE"), Year_of_Birth = c("1931",
"1932", "1932", "1932")), class = "data.frame", row.names = c(NA,
-4L))
Here is the second one (non-drug users):
> dput(df2)
structure(list(ID_NONdruguser = c("955", "567", "856", "866"),
Test_Result = c("NEGATIVE", "NEGATIVE", "NEGATIVE", "POSITIVE"
), Year_of_Birth = c("1932", "1932", "1932", "1932")), class = "data.frame", row.names = c(NA,
-4L))
I want to combine the two datasets and make it into a long format, like this:
> dput(df_final)
structure(list(ID = c("123", "234", "324", "345", "955", "567",
"856", "866"), Drug_status = c("Yes", "Yes", "Yes", "Yes", "No",
"No", "No", "No"), Test_Result = c("POSITIVE", "NEGATIVE", "NEGATIVE",
"NEGATIVE", "NEGATIVE", "NEGATIVE", "NEGATIVE", "POSITIVE"),
Year_of_Birth = c("1931", "1932", "1932", "1932", "1932",
"1932", "1932", "1932")), class = "data.frame", row.names = c(NA,
-8L))
The key requirement for df_final is a column indicating whether the user was on a drug.
CodePudding user response:
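We could rename the ID column in each dataset (ID_druguser and ID_NONdruguser) to 'ID', create a new Drug_status column in each, and then stack the two with bind_rows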
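library(dplyr)

# Drug users: standardise the ID column name and flag the group
df %>%
  rename(ID = ID_druguser) %>%
  mutate(Drug_status = 'Yes', .after = ID) %>%
  # Non-drug users: same steps, then stack; bind_rows() matches columns by name
  bind_rows(df2 %>%
              rename(ID = ID_NONdruguser) %>%
              mutate(Drug_status = 'No'))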
-output
   ID Drug_status Test_Result Year_of_Birth
1 123         Yes    POSITIVE          1931
2 234         Yes    NEGATIVE          1932
3 324         Yes    NEGATIVE          1932
4 345         Yes    NEGATIVE          1932
5 955          No    NEGATIVE          1932
6 567          No    NEGATIVE          1932
7 856          No    NEGATIVE          1932
8 866          No    POSITIVE          1932
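
If dplyr is not available, the same steps (rename, flag, stack) can probably be done in base R as well. This is only a sketch: the data is rebuilt from the question so the block runs on its own, and the names users, nonusers and combined are made up here for illustration.

```r
# Example data rebuilt from the question so the block is self-contained
df <- data.frame(
  ID_druguser   = c("123", "234", "324", "345"),
  Test_Result   = c("POSITIVE", "NEGATIVE", "NEGATIVE", "NEGATIVE"),
  Year_of_Birth = c("1931", "1932", "1932", "1932"),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  ID_NONdruguser = c("955", "567", "856", "866"),
  Test_Result    = c("NEGATIVE", "NEGATIVE", "NEGATIVE", "POSITIVE"),
  Year_of_Birth  = c("1932", "1932", "1932", "1932"),
  stringsAsFactors = FALSE
)

# Work on copies (hypothetical names) so df and df2 stay unchanged
users <- df
names(users)[names(users) == "ID_druguser"] <- "ID"
users$Drug_status <- "Yes"

nonusers <- df2
names(nonusers)[names(nonusers) == "ID_NONdruguser"] <- "ID"
nonusers$Drug_status <- "No"

# rbind() on data frames matches columns by name; then reorder the columns
combined <- rbind(users, nonusers)
combined <- combined[, c("ID", "Drug_status", "Test_Result", "Year_of_Birth")]
combined
```

Unlike bind_rows, rbind needs both data frames to have exactly the same column names, so this only works because the two datasets share Test_Result and Year_of_Birth once the ID column is renamed.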