Home > Software design >  Error in eval(predvars, data, env) when doing Bootstrapping
Error in eval(predvars, data, env) when doing Bootstrapping

Time:04-04

I am trying to fit a linear regression mode to examine the impact of COVID on traffic volume. Here is my dataset that have 31 observations which I have normalised using min-max scaling.

structure(list(Date = c("Jul-19", "Aug-19", "Sep-19", "Oct-19", 
"Nov-19", "Dec-19", "Jan-20", "Feb-20", "Mar-20", "Apr-20", "May-20", 
"Jun-20", "Jul-20", "Aug-20", "Sep-20", "Oct-20", "Nov-20", "Dec-20", 
"Jan-21", "Feb-21", "Mar-21", "Apr-21", "May-21", "Jun-21", "Jul-21", 
"Aug-21", "Sep-21", "Oct-21", "Nov-21", "Dec-21", "Jan-22"), 
    Average.Weekday.Daily.Traffic..Veh.day. = c(0.888327194565519, 
    0.957155270369628, 0.895880165623604, 0.942066917846049, 
    1, 0.877361842545003, 0.703731456172012, 0.898082382615854, 
    0.607783746512863, 0, 0.317277553217312, 0.560830088053501, 
    0.706464878860476, 0.747561202987416, 0.737478848514911, 
    0.800118905645907, 0.857496156674019, 0.809655982748128, 
    0.708357519023144, 0.926517717996616, 0.895669090512526, 
    0.856018630896471, 0.925733222167108, 0.825782121234508, 
    0.798338838875814, 0.533903939716948, 0.819263418220707, 
    0.84537340946109, 0.792752384269276, 0.64987564158039, 0.365423786054267
    ), Case = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0.00543532109442455, 
    0.00036093929142663, 5.66179280669224e-05, 0, 0.000134467579158941, 
    0.000339707568401534, 0.000127390338150575, 0.000113235856133845, 
    0, 0.000169853784200767, 0.000106158615125479, 0.00106158615125479, 
    0.0015286840578069, 0, 0, 0.00110404959730499, 0.00312814052569746, 
    0, 0.000268935158317881, 0, 0.00153576129881527, 1), Regulation = c(0, 
    0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0.666666666666667, 0.333333333333333, 
    0.333333333333333, 0.166666666666667, 0.166666666666667, 
    0.166666666666667, 0.166666666666667, 0.333333333333333, 
    0.166666666666667, 0.333333333333333, 0.166666666666667, 
    0.166666666666667, 0.333333333333333, 0.166666666666667, 
    0.166666666666667, 0.166666666666667, 0.166666666666667, 
    0, 0, 0)), row.names = c(NA, -31L), class = "data.frame")

My model and output:

m1 <- lm(Average.Weekday.Daily.Traffic..Veh.day. ~ Case   Regulation, data = data)
summary(m1)

Call:
lm(formula = Average.Weekday.Daily.Traffic..Veh.day. ~ Case   
    Regulation, data = data)

Residuals:
     Min       1Q   Median       3Q      Max 
-0.35325 -0.00991  0.00712  0.05537  0.25166 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  0.89096    0.03061  29.104  < 2e-16 ***
Case        -0.52838    0.13368  -3.952 0.000477 ***
Regulation  -0.53484    0.08077  -6.622 3.49e-07 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.1301 on 28 degrees of freedom
Multiple R-squared:  0.6548,    Adjusted R-squared:  0.6301 
F-statistic: 26.55 on 2 and 28 DF,  p-value: 3.413e-07

Although the independent variables are significant but I found that errors are not normally distributed using Jarque-Bera test. I tried to fix it by doing Bootstrapping with the following codes:

set.seed(4321)
m2 <- Boot(m1, R = 100, method = "residual")
summary(m2)

But I received an error message even though I have included Case variable in my linear model: Error in eval(predvars, data, env) : object 'Case' not found

Could anyone please help explain for me why is it giving this error and how to fix it? Thank you so much for your help.

CodePudding user response:

Your problem is due to a name clash between the name of your data frame and the function data, due to the way Boot and boot use non-standard evaluation. This is why it is best to avoid such names for your own objects. All you need to do is change data to mydata and the problem disappears.

library(car)
#> Loading required package: carData

mydata <- data
m1 <- lm(Average.Weekday.Daily.Traffic..Veh.day. ~ Case   Regulation, data = mydata)

set.seed(4321)
m2 <- Boot(m1, R = 100, method = "residual")
#> Loading required namespace: boot
summary(m2)
#> 
#> Number of bootstrap replications R = 100 
#>             original    bootBias   bootSE  bootMed
#> (Intercept)  0.89096  0.00248002 0.038248  0.89308
#> Case        -0.52838  0.00555811 0.187270 -0.51458
#> Regulation  -0.53484 -0.00088829 0.099065 -0.53900

Created on 2022-04-03 by the reprex package (v2.0.1)

  • Related