Home > OS >  Storing model predictions in a multi-dimensional array in R, indexing changes structure
Storing model predictions in a multi-dimensional array in R, indexing changes structure

Time:10-11

I am fitting a large number of models and making predictions from them to plot. To me the most efficient way of doing this should be to create a 3D array and overwrite each matrix with the corresponding predictions. However, when I use bracket indexing to overwrite the array the array changes form.

library(tidyverse)

# The data
dat <- data.frame(x=seq(0,10,1),
                  y=seq(0,5,0.5)^2,
                  y2=c(12,7,13,14,18,15,19,23,25,23,24))
# a look at the data
ggplot(dat)  
  geom_point(aes(x=x,y=y),color='red')  
  geom_point(aes(x=x,y=y2),color='blue')

# Fit some models
mod.list <- list()

mod.list[[1]] <- glm(y~x I(x^2), 
                     data = dat,
                     family = gaussian()) 
mod.list[[2]] <- glm(y2~x I(x^2), 
                     data = dat,
                     family = gaussian()) 
# make predictions
new = data.frame(x = seq(0,10,1)) # data to predict on
# Create array to hold predictions
all.preds <- array(data = 0,
                   dim = c(10,3,2))
# Overwrite prediction array per Ritchie's comment
all.preds <- array(data = 0,
                   dim = c(NROW(new),3,2))
dimnames(all.preds)[[2]] <- c('x','fit_link','se_link')

for (i in 1:2) {
  preds <-  bind_cols(new, setNames(as_tibble(predict(mod.list[[i]], 
                                                      newdata = new, 
                                                      se.fit = TRUE)[1:2]), # return standard errors
                                    c('fit_link','se_link'))) 
  all.preds[ , ,i] <- preds
}

This seems like it should be relatively straightforward, however, I cannot locate others who have done similar operations. Certainly, there must be something I am missing. This is the closest I found but I think it's easier to do in a for loop? R array indexing for multi-dimensional arrays

EDIT: I found I can do this with lists...still maybe not the best?

all.preds <- list()
# make predictions
new = data.frame(x = seq(0,10,1))


for (i in 1:2) {
  all.preds[[i]] <-  bind_cols(new, setNames(as_tibble(predict(mod.list[[i]], 
                                                      newdata = new, 
                                                      se.fit = TRUE)[1:2]), # return standard errors
                                    c('fit_link','se_link'))) 
  # all.preds[ , ,i] <- preds
  
}

CodePudding user response:

Here is a solution. I don't load package tidyverse since the code is base R only. The new data new is corrected to have only 10 rows.

# Create array to hold predictions
all.preds <- array(data = 0, dim = c(10,3,2))
dimnames(all.preds)[[2]] <- c('x','fit_link','se_link')

# make predictions
new = data.frame(x = seq(1, 10, 1))

for(i in seq_along(mod.list)) {
  preds <- predict(mod.list[[i]], newdata = new, se.fit = TRUE)[1:2]
  all.preds[, 1, i] <- new[["x"]]
  all.preds[ , 2:3, i] <- matrix(unlist(preds), ncol = 2)
}
all.preds
#> , , 1
#> 
#>        x fit_link      se_link
#>  [1,]  1     0.25 9.671887e-16
#>  [2,]  2     1.00 7.645017e-16
#>  [3,]  3     2.25 7.541841e-16
#>  [4,]  4     4.00 8.083363e-16
#>  [5,]  5     6.25 8.350338e-16
#>  [6,]  6     9.00 8.083363e-16
#>  [7,]  7    12.25 7.541841e-16
#>  [8,]  8    16.00 7.645017e-16
#>  [9,]  9    20.25 9.671887e-16
#> [10,] 10    25.00 1.396718e-15
#> 
#> , , 2
#> 
#>        x fit_link  se_link
#>  [1,]  1 10.87552 1.303368
#>  [2,]  2 12.69044 1.030230
#>  [3,]  3 14.45175 1.016326
#>  [4,]  4 16.15944 1.089301
#>  [5,]  5 17.81352 1.125278
#>  [6,]  6 19.41399 1.089301
#>  [7,]  7 20.96084 1.016326
#>  [8,]  8 22.45408 1.030230
#>  [9,]  9 23.89371 1.303368
#> [10,] 10 25.27972 1.882195

Created on 2022-10-11 with reprex v2.0.2

  • Related