I have a dataframe where the first column has dates, the second column has IDs and other columns with values. Some of the values in one column, lets say the third one, should be uptaded or inserted based on another dataframe if columns 1 and 2 match. However not all columns in both dataframes are the same.
There is a solution in Overwrite values from selected columns and matching rows from one data frame into another, R, however it does not work for 2 columns or dataframes with different sizes.
Here is some example data to make what I need clear
df1=structure(list(Date = structure(c(19216, 19216, 19219, 19219), class = "Date"),
ID = c("id1", "id2", "id1", "id2"), X1 = c(-8, -10, 5, 11
), X2 = c(0, 0, 0, 0), X3 = c("A", "A", "A", "A")), row.names = c(NA,
-4L), class = "data.frame")
df2=structure(list(Date = structure(c(19216, 19219, 19220), class = "Date"),
ID = c("id1", "id1", "id1"), X1 = c(-3, 0, 2), Y = c(2, 2,
-1)), class = "data.frame", row.names = c(NA, -3L))
df_result=structure(list(Date = structure(c(19216, 19216, 19219, 19219,
19220), class = "Date"), ID = c("id1", "id2", "id1", "id2", "id1"
), X1 = c(-3, -10, 0, 11, 2), X2 = c(0, 0, 0, 0, NA), X3 = c("A",
"A", "A", "A", NA)), row.names = c(NA, 5L), class = "data.frame")
CodePudding user response:
You can use dplyr::rows_upsert
to update existing rows in your first data frame and add the new ones. Just make sure that both dataframes have the same columns and that there are no duplicates in the second dataframe. Keys that you are trying to update should be unique
library(dplyr)
df_result2 = rows_upsert(df1, select(df2, -Y), by = c("Date", "ID"))
print(df_result2)
Date ID X1 X2 X3
1 2022-08-12 id1 -3 0 A
2 2022-08-12 id2 -10 0 A
3 2022-08-15 id1 0 0 A
4 2022-08-15 id2 11 0 A
5 2022-08-16 id1 2 NA <NA>
This seems to work.