Linear interpolation based on lookup tables in R (finding the age lower bound and upper bound)

Time:10-20

I have two data sets. For example:

# First data:
Age<-c(2,2.1,2.2,3.4,3.5,4.2,4.7,4.8,5,5.6,NA, 5.9, NA)
df1<-data.frame(Age)
# Second data:
Age2<-seq(2,20,0.25)    
Mspline<-rnorm(73)
df2<-data.frame(Age2, Mspline)

For each value of age in the first data set, I need to find the right value of Mspline (linear interpolation) based on the second data set by the following formula:

Suppose your age in the first data set is 4.8, so the lower bound and upper bound of this age are 4.75 and 5 (these are the two closest values in Age2). Then, the value of Mspline for this age would be:

Age2      Mspline     Sspline 
4.75    -0.0769      0.1592 
5       -0.0752      0.1535 
Mspline = -0.0769 + ((4.8-4.75)/0.25)·(-0.0752 - (-0.0769)) = -0.0766

Mspline at your age = (Mspline of lower bound of age) + ((your age - lower bound age)/0.25)*((Mspline of upper bound of age) - (Mspline of lower bound of age))
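
As a quick sanity check, the worked example for age 4.8 can be reproduced directly in R. This is only a sketch of the arithmetic; the variable names (age, lower_age, lower_m, upper_m, step) are illustrative and are not columns of either data set:

```r
age       <- 4.8
lower_age <- 4.75     # closest Age2 value at or below 4.8
lower_m   <- -0.0769  # Mspline at Age2 = 4.75
upper_m   <- -0.0752  # Mspline at Age2 = 5
step      <- 0.25     # spacing of the Age2 grid

# Lower-bound value plus the fraction of the step times (upper - lower)
m_at_age <- lower_m + ((age - lower_age) / step) * (upper_m - lower_m)
round(m_at_age, 4)
#> [1] -0.0766
```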

I was wondering how I can write this as a function in R?

CodePudding user response:

Here's a function that implements the formula as I understand it:

Age<-c(2,2.1,2.2,3.4,3.5,4.2,4.7,4.8,5,5.6,NA, 5.9, NA)

# Second data:
Age2<-seq(2,20,0.25)    
Mspline<-rnorm(73)

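res <- lapply(seq_along(Age), \(x){
  # Indices of the closest Age2 grid points at or below / at or above Age[x]
  lwr_ind <- max(which(Age2 <= Age[x]))
  upr_ind <- min(which(Age2 >= Age[x]))
  # Lower-bound Mspline plus the fraction of the 0.25 step times (upper - lower)
  data.frame(Age = Age[x], 
             Mspline = Mspline[lwr_ind] + ((Age[x]-Age2[lwr_ind])/0.25)*(Mspline[upr_ind] - Mspline[lwr_ind]))
})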

res <- do.call(rbind, res)
res
#>    Age     Mspline
#> 1  2.0 -0.48510574
#> 2  2.1 -0.45222184
#> 3  2.2 -0.41933793
#> 4  3.4 -1.05075284
#> 5  3.5  1.03415440
#> 6  4.2  0.90388824
#> 7  4.7  1.05441685
#> 8  4.8 -1.86696649
#> 9  5.0 -0.60582730
#> 10 5.6 -0.76802820
#> 11  NA          NA
#> 12 5.9  0.09647946
#> 13  NA          NA

Created on 2022-10-19 by the reprex package (v2.0.1)

The remaining question is whether you're trying to do something with the missing values (NA). The missing values in Age continue to be missing in the result above.
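
If plain linear interpolation is all that is needed, base R's approx() might be a shorter route than the loop. The following is a sketch under two assumptions that are mine rather than the question's: set.seed(1) is added only so the random Mspline values are reproducible, and missing ages are simply kept as NA in the result (the names interp, ok and res2 are illustrative).

```r
set.seed(1)  # assumption: only here so the example is reproducible
Age     <- c(2, 2.1, 2.2, 3.4, 3.5, 4.2, 4.7, 4.8, 5, 5.6, NA, 5.9, NA)
Age2    <- seq(2, 20, 0.25)
Mspline <- rnorm(length(Age2))

# Interpolate only the non-missing ages; the rest stay NA
interp <- rep(NA_real_, length(Age))
ok     <- !is.na(Age)
interp[ok] <- approx(x = Age2, y = Mspline, xout = Age[ok], method = "linear")$y

res2 <- data.frame(Age = Age, Mspline = interp)
res2
```

With approx()'s default rule = 1, any age outside the range of Age2 would also come back as NA rather than being extrapolated.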
