I have two xts files with daily data (The data is only one date for a month). The first file is this: - The dates in this xts are typically end of the month trading dates in a given month.
structure(c(-0.0329199999999997, 0.0874901766141374, 0.0545883292605231,
0.0687945180777207, 0.0550784545301166, 0.074678777314922, -0.0866534235058661,
0.161206236457536, 0.0704023794825748, 0.074691325661258), class = c("xts",
"zoo"), ret_type = "discrete", coredata_content = "discreteReturn", index = structure(c(1114732800,
1117497600, 1120089600, 1122595200, 1125446400, 1128038400, 1130716800,
1133308800, 1135900800, 1138665600), tzone = "UTC", tclass = "Date"), dim = c(10L,
1L), dimnames = list(NULL, "xts_left"))
The second xts file is:
structure(c(0.0052512320343876, 0.00540733325225928, 0.00580017750416384,
0.005701283061746, 0.00556285472234541, 0.00561113650865441,
0.00580424365658105, 0.005816988308881, 0.00571552920344676,
0.00574088497469671, 0.00574737930337577, 0.00589584054618375,
0.00592325487612455), class = c("xts", "zoo"), .CLASS = "double", index = structure(c(1107216000,
1109635200, 1112313600, 1114905600, 1117584000, 1120176000, 1122854400,
1125532800, 1128124800, 1130803200, 1133395200, 1136073600, 1138752000
), tzone = "UTC", tclass = "Date"), dim = c(13L, 1L))
This is how I want the output of the merge: To pick the value from right xts that corresponds to the closest date value in the left xts. For example, the value on 29-04-2005 be matched with the nearest, i.e. 01-05-2005 (dd-mm-yyyy format).
I have seen a possible way to do this using data.table with rolling joins, but I wanted to know if there is a way to do this within the xts (or similar) framework.
CodePudding user response:
Using x1 and x2 in the Note at the end define near
which given a date, tt, finds the nearest date in x2 and returns the corresponding data value. Then apply that to each date in x1.
near <- function(tt) x2[which.min(abs(time(x2) - tt))]
x12 <- transform(x1, xts_right = sapply(time(x1), near)); x12
giving:
xts_left xts_right
2005-04-29 -0.03292000 0.005701283
2005-05-31 0.08749018 0.005562855
2005-06-30 0.05458833 0.005611137
2005-07-29 0.06879452 0.005804244
2005-08-31 0.05507845 0.005816988
2005-09-30 0.07467878 0.005715529
2005-10-31 -0.08665342 0.005740885
2005-11-30 0.16120624 0.005747379
2005-12-30 0.07040238 0.005895841
2006-01-31 0.07469133 0.005923255
In the example shown in the question the nearest x2 is always at a strictly later date than x1 and x2 starts before x1. If those are general features of the problem it could alternately be expressed as:
transform(x1, xts_right = coredata(x2)[findInterval(time(x1), time(x2)) 1])
Note
x1 <-
structure(c(-0.0329199999999997, 0.0874901766141374, 0.0545883292605231,
0.0687945180777207, 0.0550784545301166, 0.074678777314922, -0.0866534235058661,
0.161206236457536, 0.0704023794825748, 0.074691325661258), .Dim = c(10L,
1L), class = c("xts", "zoo"), ret_type = "discrete",
coredata_content = "discreteReturn", index = structure(c(1114732800,
1117497600, 1120089600, 1122595200, 1125446400, 1128038400, 1130716800,
1133308800, 1135900800, 1138665600), tzone = "UTC", tclass = "Date"),
.Dimnames = list(NULL, "xts_left"))
x2 <-
structure(c(0.0052512320343876, 0.00540733325225928, 0.00580017750416384,
0.005701283061746, 0.00556285472234541, 0.00561113650865441,
0.00580424365658105, 0.005816988308881, 0.00571552920344676,
0.00574088497469671, 0.00574737930337577, 0.00589584054618375,
0.00592325487612455), .Dim = c(13L, 1L), class = c("xts", "zoo"
), .CLASS = "double", index = structure(c(1107216000, 1109635200,
1112313600, 1114905600, 1117584000, 1120176000, 1122854400, 1125532800,
1128124800, 1130803200, 1133395200, 1136073600, 1138752000),
tzone = "UTC", tclass = "Date"))