I would like to retrieve closing prices for several instruments from Yahoo Finance. Unfortunately the resulting series have different lengths, partly because of NA values. How can I remove the affected rows so that I can run a regression?
AMZN <- diff(log(tseries::get.hist.quote(instrument = "AMZN", start = START_DATE, end = END_DATE, quote = "Close", provider = "yahoo", compression = "d", retclass = "zoo")))
nrow(AMZN) # 250
SDAX <- diff(log(tseries::get.hist.quote(instrument = "^SDAXI", start = START_DATE, end = END_DATE, quote = "Close", provider = "yahoo", compression = "d", retclass = "zoo")))
nrow(SDAX) # 254
EURAUD <- diff(log(tseries::get.hist.quote(instrument = "EURAUD=X", start = START_DATE, end = END_DATE, quote = "Close", provider = "yahoo", compression = "d", retclass = "zoo")))
nrow(EURAUD) # 260
I then merge the individual series into a single object. Because the series have different lengths, the merged result contains NA values. The rows containing NA have to be removed, otherwise no regression analysis is possible.
zDataPreFX <- merge(SDAX, AMZN, EURAUD)
CodePudding user response:
This should do the trick if there are any NAs, although there were none over the start and end dates I chose. By the way, what you compute are daily change rates (differences of the log closing prices), not closing prices.
# na.omit() drops missing closes before the log differences are taken
AMZN <- diff(log(na.omit(tseries::get.hist.quote(instrument = "AMZN", start = START_DATE, end = END_DATE, quote = "Close", provider = "yahoo", compression = "d", retclass = "zoo"))))
nrow(AMZN) # 250
SDAX <- diff(log(na.omit(tseries::get.hist.quote(instrument = "^SDAXI", start = START_DATE, end = END_DATE, quote = "Close", provider = "yahoo", compression = "d", retclass = "zoo"))))
nrow(SDAX) # 254
EURAUD <- diff(log(na.omit(tseries::get.hist.quote(instrument = "EURAUD=X", start = START_DATE, end = END_DATE, quote = "Close", provider = "yahoo", compression = "d", retclass = "zoo"))))
nrow(EURAUD) # 260
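Note that dropping NAs inside each series does not by itself make the series equally long, because the three instruments are not quoted on exactly the same days. A minimal sketch of one way to align them, assuming the series are zoo objects: merge with all = FALSE, which keeps only the dates present in every series. The values below are made-up stand-ins, not Yahoo data, and the names a, b and common are only illustrative.

```r
library(zoo)

# Synthetic stand-ins for two downloaded series (hypothetical values)
d <- as.Date("2021-01-04") + 0:4
a <- zoo(c(0.01, -0.02, 0.005), d[c(1, 2, 4)])    # no value on day 3 and day 5
b <- zoo(c(0.003, 0.01, -0.002, 0.004), d[1:4])   # no value on day 5

# all = FALSE keeps only the dates that occur in every series (inner join),
# so the result has no NA padding
common <- merge(a, b, all = FALSE)
nrow(common)
```

With the real data the same call would be merge(SDAX, AMZN, EURAUD, all = FALSE).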
CodePudding user response:
You can combine all of the data with merge() and then use na.omit() to remove the rows that contain NA values. See the code example below.
start_date <- "2021-01-01"
end_date <- "2021-12-31"
# Daily log returns of the closing price, returned as zoo series
AMZN <- diff(log(tseries::get.hist.quote(
    instrument = "AMZN",
    start = start_date,
    end = end_date,
    quote = "Close",
    provider = "yahoo",
    compression = "d",
    retclass = "zoo"
)))
SDAX <- diff(log(tseries::get.hist.quote(
    instrument = "^SDAXI",
    start = start_date,
    end = end_date,
    quote = "Close",
    provider = "yahoo",
    compression = "d",
    retclass = "zoo"
)))
EURAUD <- diff(log(tseries::get.hist.quote(
    instrument = "EURAUD=X",
    start = start_date,
    end = end_date,
    quote = "Close",
    provider = "yahoo",
    compression = "d",
    retclass = "zoo"
)))
all <- merge(AMZN, SDAX, EURAUD)
head(all)
             Close.AMZN   Close.SDAX  Close.EURAUD
2021-01-04           NA           NA  0.0251187798
2021-01-05  0.009954627  0.003818769  0.0056628563
2021-01-06 -0.025211817  0.013925576 -0.0083800866
2021-01-07  0.007548605  0.010638004 -0.0038076972
2021-01-08  0.006474567 -0.002391856  0.0008108246
2021-01-11 -0.021754382 -0.009921183 -0.0001139825
all_cleaned <- na.omit(all)
head(all_cleaned)
             Close.AMZN   Close.SDAX  Close.EURAUD
2021-01-05  0.009954627  0.003818769  0.0056628563
2021-01-06 -0.025211817  0.013925576 -0.0083800866
2021-01-07  0.007548605  0.010638004 -0.0038076972
2021-01-08  0.006474567 -0.002391856  0.0008108246
2021-01-11 -0.021754382 -0.009921183 -0.0001139825
2021-01-12  0.002123521  0.009845704 -0.0008616212
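Once the NA rows are gone, the cleaned object can go straight into a regression. A rough sketch with random numbers standing in for the downloaded returns so that it runs offline; the column names and the choice of SDAX as the dependent variable are assumptions, since the question does not say which series is regressed on which.

```r
library(zoo)

set.seed(1)
dates <- as.Date("2021-01-04") + 0:9
# Hypothetical log returns standing in for the merged Yahoo series
m <- matrix(rnorm(30, sd = 0.01), ncol = 3,
            dimnames = list(NULL, c("AMZN", "SDAX", "EURAUD")))
m[1, c("AMZN", "SDAX")] <- NA          # mimic the NA row left by merge()
rets <- zoo(m, dates)

# Drop every row that has an NA in any column
rets_cleaned <- na.omit(rets)

# lm() takes a data frame; the dates become row names
fit <- lm(SDAX ~ AMZN + EURAUD, data = as.data.frame(rets_cleaned))
coef(fit)
```

With the real data you would pass as.data.frame(all_cleaned) and use the column names shown above (Close.SDAX, Close.AMZN, Close.EURAUD) in the formula.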