I want to merge df1 and df2, by pulling in only the numerics from df2 into df1? This would be a nested (Lookup(xlookup) in excel, but Im having difficulty working this in r? Any info would be much appreciated.
df1 <- data.frame(Names = c("A","B","C","D","E"),
Rank = c("R1","R3","R4","R2","R5"),
Time_in_rank = c(2,4.25,3,1.5,5))
df2 <- data.frame(Time_in_rank =c(0,1,2,3,4),
R1 =c(20000,25000,30000,35000,40000),
R2 =c(45000,50000,55000,60000,65000),
R3 =c(70000,75000,80000,85000,90000),
R4 =c(95000,96000,97000,98000,100000),
R5 =c(105000,107000,109000,111000,112000))
Desired output
Names Time_in_rank Rank Salary
A 2 R1 30000
B 4.25 R3 90000
C 3 R4 98000
Close but no cigar - Merge two data frames considering a range match between key columns
Close but still no cigar - Complex non-equi merge in R
CodePudding user response:
data.table
library(data.table)
setDT(df1)
setDT(df2)
df1[setnames(melt(df2, id.vars = "Time_in_rank"), 1, "tir"
)[, tir2 := shift(tir, type = "lead", fill = Inf), by = variable],
Salary := value,
on = .(Rank == variable, Time_in_rank >= tir, Time_in_rank < tir2)]
df1
# Names Rank Time_in_rank Salary
# <char> <char> <num> <num>
# 1: A R1 2.00 30000
# 2: B R3 4.25 90000
# 3: C R4 3.00 98000
# 4: D R2 1.50 50000
# 5: E R5 5.00 112000
CodePudding user response:
You could use a tidyverse
approach:
library(dplyr)
library(tidyr)
df1 %>%
mutate(time_floor = floor(Time_in_rank)) %>%
left_join(df2 %>%
pivot_longer(-Time_in_rank, names_to = "Rank", values_to = "Salary"),
by = c("Rank", "time_floor" = "Time_in_rank")) %>%
select(-time_floor)
This returns
Names Rank Time_in_rank Salary
1 A R1 2.00 30000
2 B R3 4.25 90000
3 C R4 3.00 98000
4 D R2 1.50 50000
5 E R5 5.00 NA
The main ideas are:
df1
'sTime_in_rank
rounded down corresponds toTime_in_rank
used indf2
.df2
is converted into a long format to join by time and rank.