Home > Enterprise >  Merging two data.tables while doing running sum in R
Merging two data.tables while doing running sum in R

Time:11-07

I have a data.table which represents the dividend payed by a company from 2019 to 2021.

library(data.table)
div_dt <- structure(list(pay_date = structure(c(18885L, 18793L, 18701L, 
18611L, 18520L, 18428L, 18337L, 18246L, 18155L, 18064L, 17910L
), class = c("IDate", "Date")), cash_amount = c(0.09, 0.09, 0.09, 
0.09, 0.08, 0.07, 0.07, 0.05, 0.04, 0.04, 0.07)), row.names = c(NA, 
-11L), class = c("data.table", "data.frame"))

Below is a table showing all calendar days of this stock between 2019 to 2021.

calendar_dt = data.table(current_date = seq(min(div_dt$pay_date), max(div_dt$pay_date), by="days"))

I want to show the sum of the last 4 quarters of dividends this stock has paid on any given date. To solve this I have added a new column div_start_date to calendar_dt, which shows the start date form which dividend has to be added to the given date current_date.

calendar_dt[, div_start_date := date - 365]

Can someone show me how to merge these tables so that for each calendar day in calendar_dt, the sum of dividends of the past 4 quarters is shown in a new column?

A one-liner and memory-efficient solution will be appreciated.

CodePudding user response:

this will work (not the most efficient join, but willl get the job done)

# set keys
setkey(calendar_dt, current_date)
setkey(div_dt, pay_date)
# join
calendar_dt[calendar_dt, 
            cast_last_365 := div_dt[pay_date            
  • Related