I am trying to join to a lookup table based on the field value of my base table "Score" being between two field values in the lookup table ("ATTRIBUTE_2" and "ATTRIBUTE_3").
Basically the R version of the following SQL:
SELECT base.PolicyNo
,base.Score
,lkp.value ScoreFactor
FROM base base
INNER JOIN lkp lkp
ON base.PERIL = lkp.PERIL
AND base.COVERAGE = lkp.COVERAGE
AND base.LocStCd = lkp.ATTRIBUTE_1
AND base.Score > lkp.ATTRIBUTE_2
AND base.Score <= lkp.ATTRIBUTE_3
I'm very new to R and still trying to grasp a lot of the syntax, so please be detailed in your responses :)
I've tried
TIERLKP[DF1, on = c("PERIL","COVERAGE",ATTRIBUTE_1 = "LocStCd",ATTRIBUTE_2 < "Score",ATTRIBUTE_3 >= "TOTALTIERSCOREVA"), nomatch=NA] and R tells me that ATTRIBUTE_2 and ATTRIBUTE_3 can't be found
I also tried
u = TIERLKP[DF1, on = c("PERIL","COVERAGE",ATTRIBUTE_1 = "LocStCd"), nomatch=NA]
u = u[u$ATTRIBUTE_2 < u$Score,]
u = u[u$ATTRIBUTE_3 >= u$Score,]
Which results in massive duplication of rows and the need for a cartesian join...doesn't seem efficient.
Per Phil's comment I tried left_join(DF1,TIERLKP, by = c("PERIL","COVERAGE","LocStCd" = "ATTRIBUTE_1","Score" > "ATTRIBUTE_2", "Score" <= "ATTRIBUTE_3"),copy = TRUE) with/without quotes as well as changing the order but am still getting "Error: Join columns must be present in data."
CodePudding user response:
We may use the data.table
syntax as below. Also, convert to data.table
if the data is data.frame
with setDT
library(data.table)
setDT(TIERLKP)
setDT(DF1)
TIERLKP[DF1, on = .(PERIL, COVERAGE,ATTRIBUTE_1 = LocStCd,
ATTRIBUTE_2 < Score,
ATTRIBUTE_3 >= TOTALTIERSCOREVA), nomatch=NA]
In the devel
version of dplyr
, we can use join_by
which does the non-equi join
library(dplyr)
inner_join(TIERLKP, DF1, by = join_by(PERIL, COVERAGE,
ATTRIBUTE_1 = LocStCd,
ATTRIBUTE_2 < Score,
ATTRIBUTE_3 >= TOTALTIERSCOREVA))