I have two dataframes:
cash_flows
coupon_date
1 2026-07-31
2 2026-01-31
3 2025-07-31
4 2025-01-31
5 2024-07-31
6 2024-01-31
discount_rates
date df
1 2023-07-25 0.9806698
2 2024-07-25 0.9737091
3 2025-07-25 0.9432057
4 2026-07-27 0.9109546
5 2027-07-26 0.8780984
I would like to create a new column in cash_flows with linearly interpolated values from the df column in discount_rates.
The desired output is therefore:
cash_flows
coupon_date new_column
1 2026-07-31 0.910594
2 2026-01-31 0.926509
3 2025-07-31 0.942678
4 2025-01-31 0.957831
5 2024-07-31 0.973208
6 2024-01-31 0.977056
I have downloaded the forecast package but am still not entirely sure how to achieve this. Any help is appreciated.
Code to replicate dataframes:
cash_flows <- data.frame(coupon_date = as.Date(c("2026-07-31","2026-01-31","2025-07-31", "2024-07-31","2024-01-31")))
drdr <- data.frame(date = as.Date(c("2023-07-25","2024-07-25","2025-07-25", "2026-07-27","2027-07-26")), df = c(0.9806698, 0.9737091, 0.9432057, 0.9109546, 0.8780984))
CodePudding user response:
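Here is an option. We fit a linear model df ~ date to drdr, then use predict to estimate df for the dates in cash_flows. Note that these numbers don't exactly match your expected output: a linear model fits a single straight line through all five points, rather than interpolating between each pair of neighbouring dates.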
library(dplyr)  # provides %>% and mutate()

fit <- lm(df ~ date, data = drdr)
cash_flows %>%
  mutate(new_column = predict(fit, newdata = data.frame(date = coupon_date)))
# coupon_date new_column
#1 2026-07-31 0.9101729
#2 2026-01-31 0.9234352
#3 2025-07-31 0.9369172
#4 2024-07-31 0.9636614
#5 2024-01-31 0.9769969
CodePudding user response:
?approx can be used for linear interpolation.
Pass the source x and y variables from the drdr data, and specify that you want the interpolated output y values at the cash_flows$coupon_date x values (xout=):
cash_flows$new_column <- approx(x=drdr$date, y=drdr$df, xout=cash_flows$coupon_date)$y
cash_flows
# coupon_date new_column
#1 2026-07-31 0.9105935
#2 2026-01-31 0.9265089
#3 2025-07-31 0.9426784
#4 2024-07-31 0.9732077
#5 2024-01-31 0.9770563
This matches your expected output exactly, except for one row (2025-01-31) that appears in the cash_flows shown earlier in the question but is missing from your code to replicate the data frames.
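As a sanity check on what approx is doing, here is a minimal sketch that recomputes one of the values by hand. It assumes the drdr data from the question; the names target, lo, hi, w, by_hand and from_approx are only illustrative, and the two bracketing rows are picked by eye rather than looked up programmatically.

```r
# Discount-rate data from the question.
drdr <- data.frame(
  date = as.Date(c("2023-07-25", "2024-07-25", "2025-07-25",
                   "2026-07-27", "2027-07-26")),
  df   = c(0.9806698, 0.9737091, 0.9432057, 0.9109546, 0.8780984)
)

# One coupon date, and the two rows of drdr that bracket it (chosen by eye).
target <- as.Date("2024-07-31")
lo <- 2  # 2024-07-25
hi <- 3  # 2025-07-25

# Fraction of the way from the lower date to the upper date, measured in days.
w <- as.numeric(target - drdr$date[lo]) /
  as.numeric(drdr$date[hi] - drdr$date[lo])

# Linear interpolation between the two bracketing discount factors.
by_hand <- drdr$df[lo] + w * (drdr$df[hi] - drdr$df[lo])

# The same point via approx(), called as in the answer above.
from_approx <- approx(x = drdr$date, y = drdr$df, xout = target)$y

round(by_hand, 7)  # 0.9732077, the value shown above for 2024-07-31
all.equal(by_hand, from_approx)
```

Note that, by default (rule = 1), approx returns NA for any xout outside the range of x. If a coupon date could fall before 2023-07-25 or after 2027-07-26, rule = 2 would return the value at the nearest end point instead, which may or may not be appropriate for discount factors.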