I have a two data tables. One has company wide rating information, the other has price data at a debt-instrument level (company's usually have multiple debt-intsruments) for a historical date range.
The data table with the rating information looks as follows (simplified for illustration):
Company Date Rating
A 2016-02-01 AAA
A 2016-02-02 AA
B 2016-02-01 BBB
B 2016-02-02 A
The debt-instrument data frame looks as follows (simplified for illustration):
Company Debt-Instrument Date Price
A X1 2016-02-01 100
A X1 2016-02-02 101
A X2 2016-02-01 98
A X2 2016-02-02 99
B Y1 2016-02-01 101
B Y1 2016-02-02 100
B Y2 2016-02-01 90
B Y2 2016-02-02 89
As you can see the rating information contains the company name and the date, which is also included in the debt-instrument table.
I would now like to add the rating information to the debt-instrument data table by matching the company name and the date.
The final data table should look as follows:
Company Debt-Instrument Date Price Rating
A X1 2016-02-01 100 AAA
A X1 2016-02-02 101 AA
A X2 2016-02-01 98 AAA
A X2 2016-02-02 99 AA
B Y1 2016-02-01 101 BBB
B Y1 2016-02-02 100 A
B Y2 2016-02-01 90 BBB
B Y2 2016-02-02 89 A
I know how to do a lookup on one single variable, however I do not know how to do it for two variables.
CodePudding user response:
You can use on = .()
in data.table
with multiple columns you would like the join like this:
df1 <- read.table(text = "Company Date Rating
A 2016-02-01 AAA
A 2016-02-02 AA
B 2016-02-01 BBB
B 2016-02-02 A", header = TRUE)
df2 <- read.table(text = "Company Debt-Instrument Date Price
A X1 2016-02-01 100
A X1 2016-02-02 101
A X2 2016-02-01 98
A X2 2016-02-02 99
B Y1 2016-02-01 101
B Y1 2016-02-02 100
B Y2 2016-02-01 90
B Y2 2016-02-02 89", header = TRUE)
library(data.table)
setDT(df1)
setDT(df2)
df2[df1, on = .(Company, Date)]
#> Company Debt.Instrument Date Price Rating
#> 1: A X1 2016-02-01 100 AAA
#> 2: A X2 2016-02-01 98 AAA
#> 3: A X1 2016-02-02 101 AA
#> 4: A X2 2016-02-02 99 AA
#> 5: B Y1 2016-02-01 101 BBB
#> 6: B Y2 2016-02-01 90 BBB
#> 7: B Y1 2016-02-02 100 A
#> 8: B Y2 2016-02-02 89 A
Created on 2022-07-24 by the reprex package (v2.0.1)
CodePudding user response:
With Base R
merge
function
merge(df2 , df1 , by = c("Company" , "Date"))
- output
Company Date Debt.Instrument Price Rating
1 A 2016-02-01 X1 100 AAA
2 A 2016-02-01 X2 98 AAA
3 A 2016-02-02 X1 101 AA
4 A 2016-02-02 X2 99 AA
5 B 2016-02-01 Y1 101 BBB
6 B 2016-02-01 Y2 90 BBB
7 B 2016-02-02 Y1 100 A
8 B 2016-02-02 Y2 89 A