I have a long dataset with responses at time 1 and time 2.
I am trying to join two other dataframes (df1 and df2) to my primary dataframe (df) to create df_join
. I've tried the following code, but it results in four separate columns: var1.x
, var2.x
, var1.y
, var2.y
where I would like just var1
and var2
What is the best approach here?
library(dplyr)
df <- tibble(ID = c(1001, 1001, 1002, 1002), time = c(1, 2, 1, 2), var3 = c("324908", "1231", "23532", "09094")
df1 <- tibble(ID = c(1001, 1002), time = c(1, 1), var1 = c(300, 400), var2 = c(300, 400))
df2 <- tibble(ID = c(1001, 1002), time = c(2, 2), var1 = c(310, 410), var2 = c(320, 420))
df_join <- df %>%
left_join(select(df1, ID, var1, var2), by = c("ID")) %>%
left_join(select(df2, ID, var1, var2), by = c("ID"))
My desired output would be:
df_join_desire <- tibble(ID = c(1001, 1001, 1002, 1002), time = c(1, 2, 1, 2), var1 = c(300, 310, 400, 410), var2 = c(300, 320, 400, 420), var3 = c(324908, 1231, 23532, 09094))
CodePudding user response:
You're encountering this error because df1
and df2
have the same column names, so you don't need to left_join
them together, you can simply bind them (i.e., putting the new rows of df2
at the bottom of df1
).
df_join <- left_join(df, bind_rows(df1, df2))
CodePudding user response:
You need to find append df1
and df2
to each other.
df_join <- df |> left_join(bind_rows(df1, df2), by = c("ID", "time"))