I'm trying to predict new values based on a regression with the predict.lm(), but the result sh

Time:03-29

I'm working with a data frame of temperature data recorded at regular intervals by a few weather stations. Here are the first few observations:

          data     hora a001 a046
241 2021-03-20 00:00:00 18.4 17.8
242 2021-03-20 01:00:00 17.7 17.8
243 2021-03-20 02:00:00 18.7 17.9
244 2021-03-20 03:00:00 17.6 17.7
245 2021-03-20 04:00:00 18.9 17.7
246 2021-03-20 05:00:00 18.5 17.8
247 2021-03-20 06:00:00 18.0 18.0
248 2021-03-20 07:00:00 17.4 17.2
249 2021-03-20 08:00:00 17.3 17.2

I fitted a regression model to test whether temperature data from one station (a001) can be explained by its closest station (a046):

bsb_out_fit_mono <- lm(tr_bsb_out$a001 ~ tr_bsb_out$a046)

Then I used this model to predict temperature data from a previous period, but the results are confusing and I think I may be doing something wrong. The data set I used to build the model had 233 observations (and thus 233 residuals):

> nrow(tr_bsb_out)
[1] 233

The newdata (a046 temperature values) I used to predict new a001 values has 100 observations, so I expect to get 100 predicted a001 values, right?

> nrow(prev)
[1] 100

but the prediction gives me 233 values!

predict.lm(bsb_out_fit_mono, newdata = as.data.frame(prev$a046))


       1        2        3        4        5        6        7        8        9       10       11 
18.18776 18.18776 18.26810 18.10742 18.10742 18.18776 18.34845 17.70570 17.70570 17.86639 18.02708 
      12       13       14       15       16       17       18       19       20       21       22 
20.59804 23.24935 24.21346 25.01689 25.65963 26.06135 26.78443 27.10580 26.06135 24.29381 24.69552 
      23       24       25       26       27       28       29       30       31       32       33 
23.00832 21.32113 21.96387 21.64250 20.91941 20.35701 20.67839 19.71427 18.91085 18.18776 19.07153 
      34       35       36       37       38       39       40       41       42       43       44 
19.55359 21.32113 21.64250 21.72284 22.76730 24.21346 25.25792 25.09723 25.73998 25.98100 24.45449 
      45       46       47       48       49       50       51       52       53       54       55 
23.24935 21.56215 20.11599 19.31256 18.58947 19.23222 20.43736 18.02708 18.66982 18.99119 18.91085 
      56       57       58       59       60       61       62       63       64       65       66 
18.91085 18.83050 19.87496 20.99976 21.24078 22.68695 23.81175 24.85621 25.01689 25.17758 25.73998 
      67       68       69       70       71       72       73       74       75       76       77 
24.85621 23.41004 21.24078 21.16044 20.03564 20.83907 20.91941 19.55359 20.11599 18.91085 17.78605 
      78       79       80       81       82       83       84       85       86       87       88 
16.98262 16.58091 15.85782 16.42022 19.23222 23.81175 23.97244 26.22203 25.49895 27.26649 26.62375 
      89       90       91       92       93       94       95       96       97       98       99 
27.26649 27.26649 25.90066 23.81175 21.56215 20.11599 19.71427 18.75016 18.91085 18.10742 18.99119 
     100      101      102      103      104      105      106      107      108      109      110 
17.70570 16.74159 16.50056 18.99119 21.88353 24.61518 25.17758 25.90066 26.70409 27.10580 27.66820 
     111      112      113      114      115      116      117      118      119      120      121 
27.58786 27.34683 25.73998 23.97244 22.20490 20.43736 20.19633 20.51770 18.75016 18.42879 19.07153 
     122      123      124      125      126      127      128      129      130      131      132 
17.14331 18.50913 17.22365 16.90228 16.82193 17.70570 20.91941 23.89209 25.17758 25.57929 26.22203 
     133      134      135      136      137      138      139      140      141      142      143 
26.70409 26.54340 26.86477 26.14169 25.25792 23.41004 20.19633 20.19633 20.11599 19.63393 19.15187 
     144      145      146      147      148      149      150      151      152      153      154 
19.15187 18.66982 18.75016 19.15187 19.07153 19.63393 19.63393 21.40147 23.41004 25.09723 25.57929 
     155      156      157      158      159      160      161      162      163      164      165 
25.65963 25.82032 26.38272 27.10580 26.22203 25.01689 22.92798 20.27667 19.23222 18.58947 17.70570 
     166      167      168      169      170      171      172      173      174      175      176 
17.94673 16.33988 16.66125 16.74159 17.38433 17.38433 18.18776 22.52627 24.69552 25.17758 26.14169 
     177      178      179      180      181      182      183      184      185      186      187 
26.46306 26.54340 28.15026 27.50752 26.22203 26.14169 24.93655 21.48181 21.40147 19.95530 18.83050 
     188      189      190      191      192      193      194      195      196      197      198 
20.19633 19.15187 17.62536 17.86639 18.83050 19.15187 19.71427 18.75016 20.11599 22.68695 24.93655 
     199      200      201      202      203      204      205      206      207      208      209 
25.98100 26.78443 27.26649 28.15026 28.23060 28.47163 27.98957 27.66820 24.69552 23.08867 21.80318 
     210      211      212      213      214      215      216      217      218      219      220 
17.62536 17.70570 16.98262 16.90228 16.17919 16.74159 16.74159 16.09885 16.25954 14.81337 17.70570 
     221      222      223      224      225      226      227      228      229      230      231 
22.68695 24.77586 26.94512 27.34683 28.15026 28.55197 28.71266 27.82889 26.62375 27.82889 24.93655 
     232      233 
22.36558 20.75873 
Warning message:
'newdata' had 100 rows but variables found have 233 rows 

What am I doing wrong?

Here are my df:

> dput(tr_bsb_out)
structure(list(data = structure(c(18706, 18706, 18706, 18706, 
18706, 18706, 18706, 18706, 18706, 18706, 18706, 18706, 18706, 
18706, 18706, 18706, 18706, 18706, 18706, 18706, 18706, 18706, 
18706, 18706, 18707, 18707, 18707, 18707, 18707, 18707, 18707, 
18707, 18707, 18707, 18707, 18707, 18707, 18707, 18707, 18707, 
18707, 18707, 18707, 18707, 18707, 18707, 18707, 18708, 18708, 
18708, 18708, 18708, 18708, 18708, 18708, 18708, 18708, 18708, 
18708, 18708, 18708, 18708, 18708, 18708, 18708, 18708, 18708, 
18708, 18708, 18708, 18709, 18709, 18709, 18709, 18709, 18709, 
18709, 18709, 18709, 18709, 18709, 18709, 18709, 18709, 18709, 
18709, 18709, 18709, 18709, 18709, 18709, 18709, 18709, 18710, 
18710, 18710, 18710, 18710, 18710, 18710, 18710, 18710, 18710, 
18710, 18710, 18710, 18710, 18710, 18710, 18710, 18710, 18710, 
18710, 18710, 18710, 18710, 18711, 18711, 18711, 18711, 18711, 
18711, 18711, 18711, 18711, 18711, 18711, 18711, 18711, 18711, 
18711, 18711, 18711, 18711, 18711, 18711, 18711, 18711, 18711, 
18711, 18712, 18712, 18712, 18712, 18712, 18712, 18712, 18712, 
18712, 18712, 18712, 18712, 18712, 18712, 18712, 18712, 18712, 
18712, 18712, 18712, 18712, 18712, 18712, 18713, 18713, 18713, 
18713, 18713, 18713, 18713, 18713, 18713, 18713, 18713, 18713, 
18713, 18713, 18713, 18713, 18713, 18713, 18713, 18713, 18713, 
18713, 18714, 18714, 18714, 18714, 18714, 18714, 18714, 18714, 
18714, 18714, 18714, 18714, 18714, 18714, 18714, 18714, 18714, 
18714, 18714, 18714, 18714, 18714, 18714, 18714, 18715, 18715, 
18715, 18715, 18715, 18715, 18715, 18715, 18715, 18715, 18715, 
18715, 18715, 18715, 18715, 18715, 18715, 18715, 18715, 18715, 
18715, 18715, 18715, 18715), class = "Date"), hora = c("00:00:00", 
"01:00:00", "02:00:00", "03:00:00", "04:00:00", "05:00:00", "06:00:00", 
"07:00:00", "08:00:00", "09:00:00", "10:00:00", "11:00:00", "12:00:00", 
"13:00:00", "14:00:00", "15:00:00", "16:00:00", "17:00:00", "18:00:00", 
"19:00:00", "20:00:00", "21:00:00", "22:00:00", "23:00:00", "00:00:00", 
"01:00:00", "02:00:00", "03:00:00", "05:00:00", "06:00:00", "07:00:00", 
"08:00:00", "09:00:00", "10:00:00", "11:00:00", "12:00:00", "13:00:00", 
"14:00:00", "15:00:00", "16:00:00", "17:00:00", "18:00:00", "19:00:00", 
"20:00:00", "21:00:00", "22:00:00", "23:00:00", "00:00:00", "01:00:00", 
"02:00:00", "03:00:00", "04:00:00", "05:00:00", "06:00:00", "07:00:00", 
"08:00:00", "09:00:00", "10:00:00", "11:00:00", "12:00:00", "13:00:00", 
"14:00:00", "15:00:00", "17:00:00", "18:00:00", "19:00:00", "20:00:00", 
"21:00:00", "22:00:00", "23:00:00", "00:00:00", "01:00:00", "02:00:00", 
"03:00:00", "04:00:00", "05:00:00", "06:00:00", "07:00:00", "08:00:00", 
"09:00:00", "10:00:00", "11:00:00", "12:00:00", "13:00:00", "14:00:00", 
"15:00:00", "16:00:00", "17:00:00", "18:00:00", "19:00:00", "20:00:00", 
"21:00:00", "22:00:00", "01:00:00", "02:00:00", "03:00:00", "04:00:00", 
"05:00:00", "06:00:00", "07:00:00", "08:00:00", "09:00:00", "10:00:00", 
"11:00:00", "12:00:00", "13:00:00", "14:00:00", "15:00:00", "16:00:00", 
"17:00:00", "18:00:00", "19:00:00", "20:00:00", "21:00:00", "22:00:00", 
"23:00:00", "00:00:00", "01:00:00", "02:00:00", "03:00:00", "04:00:00", 
"05:00:00", "06:00:00", "07:00:00", "08:00:00", "09:00:00", "10:00:00", 
"11:00:00", "12:00:00", "13:00:00", "14:00:00", "15:00:00", "16:00:00", 
"17:00:00", "18:00:00", "19:00:00", "20:00:00", "21:00:00", "22:00:00", 
"23:00:00", "00:00:00", "02:00:00", "03:00:00", "04:00:00", "05:00:00", 
"06:00:00", "07:00:00", "08:00:00", "09:00:00", "10:00:00", "11:00:00", 
"12:00:00", "13:00:00", "14:00:00", "15:00:00", "16:00:00", "17:00:00", 
"18:00:00", "19:00:00", "20:00:00", "21:00:00", "22:00:00", "23:00:00", 
"00:00:00", "01:00:00", "02:00:00", "05:00:00", "06:00:00", "07:00:00", 
"08:00:00", "09:00:00", "10:00:00", "11:00:00", "12:00:00", "13:00:00", 
"14:00:00", "15:00:00", "16:00:00", "17:00:00", "18:00:00", "19:00:00", 
"20:00:00", "21:00:00", "22:00:00", "23:00:00", "00:00:00", "01:00:00", 
"02:00:00", "03:00:00", "04:00:00", "05:00:00", "06:00:00", "07:00:00", 
"08:00:00", "09:00:00", "10:00:00", "11:00:00", "12:00:00", "13:00:00", 
"14:00:00", "15:00:00", "16:00:00", "17:00:00", "18:00:00", "19:00:00", 
"20:00:00", "21:00:00", "22:00:00", "23:00:00", "00:00:00", "01:00:00", 
"02:00:00", "03:00:00", "04:00:00", "05:00:00", "06:00:00", "07:00:00", 
"08:00:00", "09:00:00", "10:00:00", "11:00:00", "12:00:00", "13:00:00", 
"14:00:00", "15:00:00", "16:00:00", "17:00:00", "18:00:00", "19:00:00", 
"20:00:00", "21:00:00", "22:00:00", "23:00:00"), a001 = c(18.4, 
17.7, 18.7, 17.6, 18.9, 18.5, 18, 17.4, 17.3, 17.3, 19.5, 20.6, 
21.7, 22.5, 24.9, 25.1, 26.2, 26.5, 27.3, 27.5, 26, 24.4, 23.8, 
22.8, 22, 21.7, 21.2, 20.6, 19.8, 19.4, 19, 17.6, 18.3, 18.7, 
19.5, 19.4, 20.6, 21.5, 23, 24.5, 25.6, 24.8, 25.9, 25.1, 22.8, 
22, 21, 20.5, 20.4, 20, 19.6, 19.3, 18.9, 18.6, 18.2, 17.8, 17.6, 
18.2, 18.5, 19.8, 21.4, 22.6, 22.6, 24.5, 25.4, 26, 25, 23.8, 
23, 22.3, 21.1, 21.1, 20.5, 19.9, 19.6, 17.7, 18.9, 19.2, 16.1, 
16, 17.7, 21.3, 22.4, 23.9, 24.6, 26.3, 26.9, 26.4, 26.9, 27.2, 
26.6, 24.5, 23.4, 21.5, 21.1, 19.2, 17.4, 19.8, 19.4, 18.3, 18.9, 
17.8, 19.4, 22.3, 23.5, 24.8, 25.2, 26.5, 26.9, 28.1, 27.9, 27.4, 
26.5, 25.3, 24, 22.8, 22.2, 21.4, 20.9, 18.8, 16.5, 16.7, 18.8, 
19.8, 20, 19.7, 20.4, 22, 23.4, 24.3, 25.3, 25.8, 27, 27.1, 26.9, 
26.6, 25.9, 24.7, 23.3, 22.5, 21.4, 20.4, 17, 16.3, 15.9, 15.4, 
14.7, 14.8, 14.5, 16, 20.6, 22.4, 24.3, 24.6, 25.1, 26.4, 26.1, 
25.5, 25.8, 25.1, 23, 20.9, 18.4, 18.5, 18.2, 17.7, 16.3, 16.3, 
14.5, 14.4, 17.8, 17.6, 22, 24, 24.8, 26.3, 26.9, 27.2, 27.5, 
28.2, 27.5, 27.2, 24.4, 21.7, 20.4, 22.5, 22.2, 22.3, 21.6, 17.3, 
16.9, 15.9, 15.5, 15, 14.8, 18.2, 22.6, 24.4, 25.5, 26.7, 27.3, 
27.6, 27.9, 28.5, 28.6, 28.3, 26.3, 23.7, 20.1, 19.8, 19.1, 18.4, 
18.2, 17.5, 17.1, 15.9, 15.7, 15.6, 15, 19, 24.5, 25.4, 26.8, 
27.5, 29.1, 28.6, 30, 29.9, 29.7, 29.2, 25.7, 21.6, 21.4), a046 = c(17.8, 
17.8, 17.9, 17.7, 17.7, 17.8, 18, 17.2, 17.2, 17.4, 17.6, 20.8, 
24.1, 25.3, 26.3, 27.1, 27.6, 28.5, 28.9, 27.6, 25.4, 25.9, 23.8, 
21.7, 22.5, 22.1, 21.2, 20.5, 20.9, 19.7, 18.7, 17.8, 18.9, 19.5, 
21.7, 22.1, 22.2, 23.5, 25.3, 26.6, 26.4, 27.2, 27.5, 25.6, 24.1, 
22, 20.2, 19.2, 18.3, 19.1, 20.6, 17.6, 18.4, 18.8, 18.7, 18.7, 
18.6, 19.9, 21.3, 21.6, 23.4, 24.8, 26.1, 26.3, 26.5, 27.2, 26.1, 
24.3, 21.6, 21.5, 20.1, 21.1, 21.2, 19.5, 20.2, 18.7, 17.3, 16.3, 
15.8, 14.9, 15.6, 19.1, 24.8, 25, 27.8, 26.9, 29.1, 28.3, 29.1, 
29.1, 27.4, 24.8, 22, 20.2, 19.7, 18.5, 18.7, 17.7, 18.8, 17.2, 
16, 15.7, 18.8, 22.4, 25.8, 26.5, 27.4, 28.4, 28.9, 29.6, 29.5, 
29.2, 27.2, 25, 22.8, 20.6, 20.3, 20.7, 18.5, 18.1, 18.9, 16.5, 
18.2, 16.6, 16.2, 16.1, 17.2, 21.2, 24.9, 26.5, 27, 27.8, 28.4, 
28.2, 28.6, 27.7, 26.6, 24.3, 20.3, 20.3, 20.2, 19.6, 19, 19, 
18.4, 18.5, 19, 18.9, 19.6, 19.6, 21.8, 24.3, 26.4, 27, 27.1, 
27.3, 28, 28.9, 27.8, 26.3, 23.7, 20.4, 19.1, 18.3, 17.2, 17.5, 
15.5, 15.9, 16, 16.8, 16.8, 17.8, 23.2, 25.9, 26.5, 27.7, 28.1, 
28.2, 30.2, 29.4, 27.8, 27.7, 26.2, 21.9, 21.8, 20, 18.6, 20.3, 
19, 17.1, 17.4, 18.6, 19, 19.7, 18.5, 20.2, 23.4, 26.2, 27.5, 
28.5, 29.1, 30.2, 30.3, 30.6, 30, 29.6, 25.9, 23.9, 22.3, 17.1, 
17.2, 16.3, 16.2, 15.3, 16, 16, 15.2, 15.4, 13.6, 17.2, 23.4, 
26, 28.7, 29.2, 30.2, 30.7, 30.9, 29.8, 28.3, 29.8, 26.2, 23, 
21)), row.names = c(241L, 242L, 243L, 244L, 245L, 246L, 247L, 
248L, 249L, 250L, 251L, 252L, 253L, 254L, 255L, 256L, 257L, 258L, 
259L, 260L, 261L, 262L, 263L, 264L, 265L, 266L, 267L, 268L, 270L, 
271L, 272L, 273L, 274L, 275L, 276L, 277L, 278L, 279L, 280L, 281L, 
282L, 283L, 284L, 285L, 286L, 287L, 288L, 289L, 290L, 291L, 292L, 
293L, 294L, 295L, 296L, 297L, 298L, 299L, 300L, 301L, 302L, 303L, 
304L, 306L, 307L, 308L, 309L, 310L, 311L, 312L, 313L, 314L, 315L, 
316L, 317L, 318L, 319L, 320L, 321L, 322L, 323L, 324L, 325L, 326L, 
327L, 328L, 329L, 330L, 331L, 332L, 333L, 334L, 335L, 338L, 339L, 
340L, 341L, 342L, 343L, 344L, 345L, 346L, 347L, 348L, 349L, 350L, 
351L, 352L, 353L, 354L, 355L, 356L, 357L, 358L, 359L, 360L, 361L, 
362L, 363L, 364L, 365L, 366L, 367L, 368L, 369L, 370L, 371L, 372L, 
373L, 374L, 375L, 376L, 377L, 378L, 379L, 380L, 381L, 382L, 383L, 
384L, 385L, 387L, 388L, 389L, 390L, 391L, 392L, 393L, 394L, 395L, 
396L, 397L, 398L, 399L, 400L, 401L, 402L, 403L, 404L, 405L, 406L, 
407L, 408L, 409L, 410L, 411L, 414L, 415L, 416L, 417L, 418L, 419L, 
420L, 421L, 422L, 423L, 424L, 425L, 426L, 427L, 428L, 429L, 430L, 
431L, 432L, 433L, 434L, 435L, 436L, 437L, 438L, 439L, 440L, 441L, 
442L, 443L, 444L, 445L, 446L, 447L, 448L, 449L, 450L, 451L, 452L, 
453L, 454L, 455L, 456L, 457L, 458L, 459L, 460L, 461L, 462L, 463L, 
464L, 465L, 466L, 467L, 468L, 469L, 470L, 471L, 472L, 473L, 474L, 
475L, 476L, 477L, 478L, 479L, 480L), class = "data.frame")

and

> dput(prev)
structure(list(data = structure(c(18696, 18696, 18696, 18696, 
18696, 18696, 18696, 18696, 18696, 18696, 18696, 18696, 18696, 
18696, 18696, 18696, 18696, 18696, 18696, 18696, 18696, 18697, 
18697, 18697, 18697, 18697, 18697, 18697, 18697, 18697, 18697, 
18697, 18697, 18697, 18697, 18697, 18697, 18697, 18697, 18697, 
18697, 18697, 18697, 18698, 18698, 18698, 18698, 18698, 18698, 
18698, 18698, 18698, 18698, 18698, 18698, 18698, 18698, 18698, 
18698, 18698, 18698, 18698, 18698, 18698, 18698, 18698, 18698, 
18699, 18699, 18699, 18699, 18699, 18699, 18699, 18699, 18699, 
18699, 18699, 18699, 18699, 18699, 18699, 18699, 18699, 18699, 
18699, 18699, 18699, 18699, 18700, 18700, 18700, 18700, 18700, 
18700, 18700, 18700, 18700, 18700, 18700), class = "Date"), hora = c("03:00:00", 
"04:00:00", "05:00:00", "06:00:00", "07:00:00", "08:00:00", "09:00:00", 
"10:00:00", "11:00:00", "12:00:00", "13:00:00", "14:00:00", "15:00:00", 
"16:00:00", "17:00:00", "18:00:00", "19:00:00", "20:00:00", "21:00:00", 
"22:00:00", "23:00:00", "00:00:00", "02:00:00", "04:00:00", "05:00:00", 
"06:00:00", "07:00:00", "08:00:00", "09:00:00", "10:00:00", "11:00:00", 
"12:00:00", "13:00:00", "14:00:00", "15:00:00", "16:00:00", "17:00:00", 
"18:00:00", "19:00:00", "20:00:00", "21:00:00", "22:00:00", "23:00:00", 
"00:00:00", "01:00:00", "02:00:00", "03:00:00", "04:00:00", "05:00:00", 
"06:00:00", "07:00:00", "08:00:00", "09:00:00", "10:00:00", "11:00:00", 
"12:00:00", "13:00:00", "14:00:00", "15:00:00", "16:00:00", "17:00:00", 
"18:00:00", "19:00:00", "20:00:00", "21:00:00", "22:00:00", "23:00:00", 
"02:00:00", "03:00:00", "04:00:00", "05:00:00", "06:00:00", "07:00:00", 
"08:00:00", "09:00:00", "10:00:00", "11:00:00", "12:00:00", "13:00:00", 
"14:00:00", "15:00:00", "16:00:00", "17:00:00", "18:00:00", "19:00:00", 
"20:00:00", "21:00:00", "22:00:00", "23:00:00", "00:00:00", "01:00:00", 
"02:00:00", "03:00:00", "04:00:00", "05:00:00", "06:00:00", "07:00:00", 
"08:00:00", "09:00:00", "10:00:00"), a001 = c(19.9, 19.2, 19.2, 
19.1, 18.9, 19, 18.9, 19.6, 19.4, 19.4, 19.5, 20.1, 21.9, 23.6, 
26, 24.6, 24.1, 22.6, 21.9, 20.4, 19.9, 19.5, 19, 18.7, 18.5, 
18.2, 17.7, 17.8, 18.1, 18.7, 20, 22.4, 23, 24.2, 26.3, 25.8, 
25.3, 25.1, 25.7, 24.4, 23.1, 22.1, 20.8, 20.2, 19.1, 18.7, 19.3, 
19.3, 19.7, 19.5, 18.6, 17.8, 18.3, 19.5, 20.4, 21.4, 23.6, 24.4, 
26.3, 25.6, 25.4, 19.4, 20.9, 21, 20.4, 19.8, 19.8, 19.2, 19.7, 
19.7, 19.3, 18.8, 18.7, 18.6, 19.3, 19.5, 21.3, 23.5, 24.3, 26, 
26, 22.7, 26, 26.9, 25.1, 24.2, 23.9, 23.2, 21.5, 21.9, 21.2, 
20.1, 19.7, 19.8, 19.5, 19.2, 19.3, 19, 18.8, 19.7), a046 = c(20.3, 
20.6, 20.7, 19.8, 20.1, 19.4, 19.9, 19.9, 20.3, 20.9, 21, 21.7, 
23.5, 24.4, 25, 25.3, 22.4, 22.9, 22.7, 21, 19.5, 20.1, 20, 19.9, 
19.2, 19.1, 19.2, 19, 18.2, 19.9, 22.2, 23.5, 25.2, 24.7, 25.8, 
25.9, 26.3, 27, 25.8, 24.7, 23.9, 22.4, 20.1, 19.5, 19.8, 19.5, 
19.5, 19.8, 18.8, 19.2, 19.3, 18.8, 18.9, 19.1, 20.8, 22, 24.5, 
24.9, 24.2, 25.2, 27.3, 26.6, 24.1, 23.3, 21.4, 21.2, 20.5, 19.4, 
19.6, 19.4, 20.6, 19.1, 19.2, 18.9, 18.5, 19.4, 21.2, 23, 26.8, 
26.3, 26.8, 24.3, 23.5, 23.7, 24.2, 23.7, 22.7, 22.2, 21.8, 21.6, 
21.5, 21.4, 20.6, 19.4, 19.2, 18.7, 19.1, 19.4, 19.4, 19.9)), row.names = c(4L, 
5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L, 
19L, 20L, 21L, 22L, 23L, 24L, 25L, 27L, 29L, 30L, 31L, 32L, 33L, 
34L, 35L, 36L, 37L, 38L, 39L, 40L, 41L, 42L, 43L, 44L, 45L, 46L, 
47L, 48L, 49L, 50L, 51L, 52L, 53L, 54L, 55L, 56L, 57L, 58L, 59L, 
60L, 61L, 62L, 63L, 64L, 65L, 66L, 67L, 68L, 69L, 70L, 71L, 72L, 
75L, 76L, 77L, 78L, 79L, 80L, 81L, 82L, 83L, 84L, 85L, 86L, 87L, 
88L, 89L, 90L, 91L, 92L, 93L, 94L, 95L, 96L, 97L, 98L, 99L, 100L, 
101L, 102L, 103L, 104L, 105L, 106L, 107L), class = "data.frame")

CodePudding user response:

The predict() function looks for a variable in newdata= with the same name as that used in the regression model. You need to use the following form:

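# Use bare column names in the formula and pass the data frame via data=
bsb_out_fit_mono <- lm(a001 ~ a046, data = tr_bsb_out)
# newdata only needs a column named a046, which prev already has
prev.prd <- predict.lm(bsb_out_fit_mono, newdata = prev)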

Now predict() uses the a046 column inside prev and not the one inside tr_bsb_out. In your original model the predictor was the expression tr_bsb_out$a046, and newdata contained no variable with that name (as.data.frame(prev$a046) gives a column named prev$a046, which is not the same as tr_bsb_out$a046). predict() therefore fell back to the original tr_bsb_out$a046 values, which is why you got 233 predictions along with the warning that newdata had 100 rows but the variables found had 233 rows. With the model fitted as above, newdata = data.frame(a046 = prev$a046) would also work, but it is more typing.
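
To see the difference side by side, here is a minimal sketch on simulated data. The data frames train and new are made-up stand-ins for tr_bsb_out and prev (the values and coefficients are invented for illustration); only the row counts, 233 and 100, match the question.

```r
# Simulated stand-ins for tr_bsb_out (233 rows) and prev (100 rows).
# The numbers are invented; only the shapes mirror the question.
set.seed(1)
train <- data.frame(a046 = runif(233, min = 14, max = 31))
train$a001 <- 2 + 0.85 * train$a046 + rnorm(233)
new <- data.frame(a046 = runif(100, min = 18, max = 28))

# Formula written with `$`: the predictor is the expression train$a046,
# so predict() cannot match it to any column in newdata.
fit_dollar <- lm(train$a001 ~ train$a046)
p_dollar <- suppressWarnings(predict(fit_dollar, newdata = new))
length(p_dollar)  # one value per training row, not per row of `new`

# Formula written with bare names plus data=: the predictor is a046,
# which predict() looks up in newdata.
fit_names <- lm(a001 ~ a046, data = train)
p_names <- predict(fit_names, newdata = new)
length(p_names)   # one value per row of `new`
```

The first call returns one prediction per training row (and, without suppressWarnings(), emits the same warning as in the question), while the second returns one prediction per row of new.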
