Can I use predict() to predict INDEPENDENT variable from dependent variable in a model?

Time:06-11

I have a growth rate model:

model <- nls(Length~a*exp(-b*exp(-c*Age)), data=df, start=list(a=160,b=0.5, c=0.1))

> summary(model)

Formula: Length ~ a * exp(-b * exp(-c * Age))

Parameters:
   Estimate Std. Error t value Pr(>|t|)    
a 173.03146   12.68100  13.645  < 2e-16 ***
b   0.54255    0.06118   8.868 9.94e-15 ***
c   0.13961    0.04195   3.328  0.00117 ** 
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 13.73 on 117 degrees of freedom

Number of iterations to convergence: 12 
Achieved convergence tolerance: 7.693e-06
  (4 observations deleted due to missingness)

I now want to apply this to a separate dataset for which I have Length data but not Age, i.e. I want to predict age (the independent variable in my model) from length (the dependent variable). Is this possible?

I would like predicted age to appear as a new column in my dataset so I have a corresponding predicted age for each measured length. I feel like this will be something like:

df2$age.predict <- predict(model, newdata=data.frame(Length=df2$Length))

but I know this isn't right. I also don't want to create a new data frame or list; I want the predicted ages to appear as a column in df2.

TIA

CodePudding user response:

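predict() only returns the response (Length) for given values of the predictor (Age), so it cannot be run in reverse. The bottom line, though, is that given your model you can get the predicted value of Age for a given Length like this: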

df2$Age <- -log(log(173.03146 / df2$Length) / 0.54255) / 0.13961
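The one-liner above can be wrapped in a small function so that lengths outside the invertible range are handled explicitly. This is a minimal sketch, not part of the original answer: predict_age is a hypothetical helper name, the df2 below is a made-up stand-in for the asker's data, and the estimates are copied from the summary() output (with the fitted object at hand, coef(model) should give the same values by name). The inverse is only defined for 0 < Length < a, because a is the asymptote of the curve; lengths below a * exp(-b), the fitted length at Age 0, come back as negative ages.

```r
# Hypothetical helper: invert Length = a * exp(-b * exp(-c * Age)) for Age.
predict_age <- function(len, a, b, c) {
  age <- rep(NA_real_, length(len))
  # Only lengths strictly between 0 and the asymptote a can be inverted;
  # everything else (including missing values) stays NA.
  ok <- !is.na(len) & len > 0 & len < a
  age[ok] <- -log(log(a / len[ok]) / b) / c
  age
}

# Estimates copied from the summary() output in the question.
est <- c(a = 173.03146, b = 0.54255, c = 0.13961)

# Made-up stand-in for df2, for illustration only.
df2 <- data.frame(Length = c(110, 140, 170, 180, NA))

# Adds the predictions as a new column of the existing data frame.
df2$age.predict <- predict_age(df2$Length, est[["a"]], est[["b"]], est[["c"]])
```

Returning NA for lengths at or above the asymptote is a design choice made here; without it, R would return NaN with a warning for those rows.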

Explanation

You didn't supply any sample data, but we don't really need it. Let's take the results of your regression:

a <- 173.03146
b <-  0.54255
c <-  0.13961

Given your formula, we can predict Length for given values of Age:

Age <- 1:50

Length <- a * exp(-b * exp(-c * Age))

And we can see this gives reasonable-looking results:

plot(Age, Length)

Now, to get Age given Length, the easiest thing to do is rearrange the formula so that it is in terms of Age (see the footnote for the steps to do this, written as pseudocode):

Predicted_Age <- -log(log(a / Length) / b) / c

If we have this right, then Predicted_Age, which was calculated only from Length, should be a vector of ages from 1 to 50:

Predicted_Age
#>  [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
#> [26] 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50
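As an independent cross-check of the rearranged formula, the same inversion can be done numerically with uniroot(), which does not rely on the algebra being right. This is a sketch under assumptions: the test length of 150 and the search interval of 0 to 100 are arbitrary values chosen for illustration, and the coefficients are the estimates from the question.

```r
a <- 173.03146
b <- 0.54255
c <- 0.13961

# Forward model: predicted Length at a given Age.
gompertz <- function(age) a * exp(-b * exp(-c * age))

# Arbitrary test length for illustration; it must lie below the asymptote a.
target_length <- 150

# Closed-form inverse, as derived in the answer.
age_closed <- -log(log(a / target_length) / b) / c

# Numerical inverse: find the age at which the curve crosses target_length.
age_numeric <- uniroot(function(age) gompertz(age) - target_length,
                       interval = c(0, 100), tol = 1e-10)$root

# TRUE if the two approaches agree to within a small tolerance.
agree <- abs(age_closed - age_numeric) < 1e-6
```

The numerical route is slower and handles one length at a time, so the closed form remains the practical choice here; root finding is mainly useful when a model cannot be rearranged by hand.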

Created on 2022-06-10 by the reprex package (v2.0.1)


Footnote - math steps to rearrange formula (pseudocode)

Length == a*exp(-b*exp(-c*Age))            # Original formula
log(Length) == log(a) -b*exp(-c*Age)       # Take log of each side
log(Length) - log(a) == -b*exp(-c*Age)     # Subtract log(a) from each side
log(a) - log(Length) == b * exp(-c*Age)    # Negate both sides
log(a / Length) == b * exp(-c*Age)         # Simplify log on left
log(log(a / Length)) == log(b) - c * Age   # Take log of both sides
log(log(a / Length)) - log(b) == -c * Age  # Subtract log(b) from each side
log(log(a / Length) / b) == -c * Age       # Simplify left hand side
-log(log(a / Length) / b) / c == Age       # Divide both sides by -c
Age == -log(log(a / Length) / b) / c       # Swap left and right