I have two datasets of unequal length that share the same variables. I want to sum the "value" variable of the two datasets by "Date".
Dataset 1:
Date | value |
---|---|
1/1/2000 | 1 |
2/1/2000 | 1 |
3/1/2000 | 2 |
4/1/2000 | 3 |
5/1/2000 | 4 |
6/1/2000 | 5 |
7/1/2000 | 2 |
Dataset 2:
Date | value |
---|---|
2/1/2000 | 5 |
3/1/2000 | 7 |
5/1/2000 | 2 |
7/1/2000 | 9 |
Expected outcome:
Date | value |
---|---|
1/1/2000 | 1 |
2/1/2000 | 6 |
3/1/2000 | 9 |
4/1/2000 | 3 |
5/1/2000 | 6 |
6/1/2000 | 5 |
7/1/2000 | 11 |
CodePudding user response:
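The safest option would be a powerjoin:
library(powerjoin)
# A full join keeps dates that appear in only one dataset (an inner join would drop them);
# `rw ~` applies the conflict function row by row to the two "value" columns.
power_full_join(
  df1, df2,
  by = "Date",
  conflict = rw ~ sum(.x, .y, na.rm = TRUE)
)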
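But here, since every date in df2 also appears in df1, a simple match should suffice as well:
# rowSums with na.rm = TRUE makes dates missing from df2 contribute 0 instead of NA
df1$value <- rowSums(cbind(df1$value, df2$value[match(df1$Date, df2$Date)]), na.rm = TRUE)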
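To see what `match` does on the question's data, here is a minimal base R sketch. It assumes the Date columns are stored as character strings (the question does not say which type they are), and the names `idx`, `add` and `result` are only illustrative. `match` returns `NA` for dates that are missing from `df2`, so those positions have to be turned into 0 before adding:

```r
# Sample data from the question; Date kept as character (an assumption).
df1 <- data.frame(
  Date  = c("1/1/2000", "2/1/2000", "3/1/2000", "4/1/2000",
            "5/1/2000", "6/1/2000", "7/1/2000"),
  value = c(1, 1, 2, 3, 4, 5, 2)
)
df2 <- data.frame(
  Date  = c("2/1/2000", "3/1/2000", "5/1/2000", "7/1/2000"),
  value = c(5, 7, 2, 9)
)

# Position of each df1 date in df2; NA where df2 has no such date.
idx <- match(df1$Date, df2$Date)

# Values to add, with unmatched dates contributing 0 instead of NA.
add <- df2$value[idx]
add[is.na(add)] <- 0

result <- data.frame(Date = df1$Date, value = df1$value + add)
print(result)
```

This only covers dates that are present in `df1`. A date that appears only in `df2` would be dropped, which is where a full join or an aggregate over the combined rows is more general.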
CodePudding user response:
You can aggregate the combined data frames by sum:
df1 <- structure(list(Date = structure(c(10957, 10958, 10959, 10960,
10961, 10962, 10963), class = "Date"), value = c(1, 1, 2, 3,
4, 5, 2)), class = "data.frame", row.names = c(NA, -7L))
df2 <- structure(list(Date = structure(c(10958, 10959, 10961, 10963), class = "Date"),
value = c(5, 7, 2, 9)), class = "data.frame", row.names = c(NA, -4L))
aggregate(value ~ Date, rbind(df1, df2), sum)
Date value
1 2000-01-01 1
2 2000-01-02 6
3 2000-01-03 9
4 2000-01-04 3
5 2000-01-05 6
6 2000-01-06 5
7 2000-01-07 11