I have two data sets. For example:
# First data:
Age<-c(2,2.1,2.2,3.4,3.5,4.2,4.7,4.8,5,5.6,NA, 5.9, NA)
df1<-data.frame(Age)
# Second data:
Age2<-seq(2,20,0.25)
Mspline<-rnorm(73)
df2<-data.frame(Age2, Mspline)
For each value of age in the first data set, I need to find the right value of Mspline (linear interpolation) based on the second data set by the following formula:
Suppose the age in the first data set is 4.8, so the lower and upper bounds of this age are 4.75 and 5 (these are the two closest values in Age2). Then the value of Mspline for this age would be:
Age2   Mspline   Sspline
4.75   -0.0769   0.1592
5      -0.0752   0.1535
Mspline = -0.0769 + ((4.8 - 4.75)/0.25) * (-0.0752 - (-0.0769)) = -0.0766
Mspline at your age = (Mspline at lower bound of age) + ((your age - lower bound age)/0.25) * ((Mspline at upper bound of age) - (Mspline at lower bound of age))
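As a quick sanity check of the arithmetic above, the worked example can be reproduced in R. This is only a sketch: the variable names (age, lower_age, lower_m and so on) are made up for illustration, and the two Mspline values are the ones from the small table above, not from the random df2.

```r
# Values copied from the example table (illustrative names, not part of df1/df2)
age       <- 4.8
lower_age <- 4.75
upper_age <- 5
lower_m   <- -0.0769
upper_m   <- -0.0752

# Linear interpolation: start at the lower value and move a fraction
# of the way towards the upper value
frac     <- (age - lower_age) / (upper_age - lower_age)  # 0.05 / 0.25 = 0.2
m_at_age <- lower_m + frac * (upper_m - lower_m)
round(m_at_age, 4)
#> [1] -0.0766
```

Note that the last factor is (upper - lower); with the two terms swapped the result would be -0.0772 instead.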
I was wondering how I can write this as a function in R.
Answer:
Here's a function that implements the formula as I understand it:
Age<-c(2,2.1,2.2,3.4,3.5,4.2,4.7,4.8,5,5.6,NA, 5.9, NA)
# Second data:
Age2<-seq(2,20,0.25)
Mspline<-rnorm(73)
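res <- lapply(seq_along(Age), \(x){
  # indices of the closest grid ages at or below / at or above Age[x]
  # (NA ages yield NA here, with warnings from max()/min() on empty input)
  lwr_ind <- max(which(Age2 <= Age[x]))
  upr_ind <- min(which(Age2 >= Age[x]))
  data.frame(Age = Age[x],
             # lower value + fraction of the 0.25 step covered * (upper - lower)
             Mspline = Mspline[lwr_ind] + ((Age[x]-Age2[lwr_ind])/0.25)*(Mspline[upr_ind] - Mspline[lwr_ind]))
})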
res <- do.call(rbind, res)
res
#> Age Mspline
#> 1 2.0 -0.48510574
#> 2 2.1 -0.45222184
#> 3 2.2 -0.41933793
#> 4 3.4 -1.05075284
#> 5 3.5 1.03415440
#> 6 4.2 0.90388824
#> 7 4.7 1.05441685
#> 8 4.8 -1.86696649
#> 9 5.0 -0.60582730
#> 10 5.6 -0.76802820
#> 11 NA NA
#> 12 5.9 0.09647946
#> 13 NA NA
Created on 2022-10-19 by the reprex package (v2.0.1)
The remaining question is whether you're trying to do something with the missing values (NA). The missing values in Age continue to be missing in the result above.
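If plain linear interpolation is all that is needed, base R's approx() may be a shorter route. The sketch below assumes the same Age, Age2 and Mspline vectors as in the question; the set.seed() call and the names interp and res2 are additions made here for reproducibility and illustration, not part of the original data.

```r
# Same example data as in the question; set.seed() is added only so the
# sketch is reproducible
set.seed(1)
Age     <- c(2, 2.1, 2.2, 3.4, 3.5, 4.2, 4.7, 4.8, 5, 5.6, NA, 5.9, NA)
Age2    <- seq(2, 20, 0.25)
Mspline <- rnorm(length(Age2))

# approx() interpolates linearly between the (Age2, Mspline) points
# at each value given in xout
interp <- approx(x = Age2, y = Mspline, xout = Age)$y

res2 <- data.frame(Age = Age, Mspline = interp)
res2
```

With the default settings, NA ages stay NA in the result, and ages outside the range of Age2 (below 2 or above 20) also come back as NA rather than being extrapolated.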