Home > Software engineering >  random samples and distributions in R
random samples and distributions in R

Time:11-26

I'm trying to build an estimator to compare the asymptotics of the GLS and OLS estimators. My idea is to try and see what happens at large samples, and with many of them.

Ideally, I would like to create a loop that would generate 6000 different random samples, of sizes 50 and 100 each, for different parameter values.

N=1000
n=c(50, 100)

#parameters
alpha0=1
beta0=1
gamma0=c(0, 0.1, 0.5)

alpha1=matrix(NA,N,6)
beta1=matrix(NA,N,6)
alpha2=matrix(NA,N,6)
beta2=matrix(NA,N,6)
alphaOLS=matrix(NA,N,6)
betaOLS=matrix(NA,N,6)

the different samples come from the combinations of gamma0 and n, which would equal 6 (times N) to get 6000. My first idea was to build a loop for the generation of the random samples the model I'm trying to work with is the following y_i=alpha beta*x_i u_i

u_i=e_i*h(x_i)^(1/2) and h(x)=exp(gamma0)

u <- list()


for (i in n) {
  for (k in gamma0) {
    x=rnorm(i,0,1)
    h=exp(gamma0[k]*x)
    e=rnorm(i,0,1)
    u[[i]] <- e*h^(1/2)
    
  }
  
}

The issue with this loop is that I'm only getting one random sample in x and e, and h is coming out as an empty matrix, and hence, u is also coming out empty. h here should be a matrix where the columns correspond to x* the different values of gamma0. e is supposed to be N(0,1) and u is meant to be the residual of the model

My ideal output should be get this loop to work, because from there on, I can sort my way around building an OLS and GLS estimator manually.

Thanks a lot!

CodePudding user response:

This should work, here we use directly i and k instead of n and gamma0 in the loop.

### parameters
N <- 1000
n <- c(50, 100)

alpha0 <- 1
beta0 <- 1
gamma0 <- c(0, 0.1, 0.5)

### Initiating objects for the loop
u <- list()
num_iter <- 0

### Looping
for (i in n) {
  for (k in gamma0) {
    
    num_iter <- num_iter   1
    
    x <- rnorm(i, 0, 1)
    h <- exp(k * x)
    
    e <- rnorm(i, 0, 1)
    u[[num_iter]] <- e * h^(1/2)
    names(u)[num_iter] <- paste("n:", i, ", gamma:", k, sep="", collapse=" ")
    
  }
}

### Display results
u

CodePudding user response:

Alternatively, avoid the loop altogether by using R's inherent vectorisation and the tidyverse's ability to process groups. A good rule of thumb when using R is "if I'm thinking of using a for loop, there's probably a better way to do it"...

Here, I only do OLS for simplicity, but the extension to use both methods should be straightforward. You can wrap the code in a function and provide parameters for, say, N, n and gamma0 to provide even more generality.

library(tidyverse)
library(broom)

# Create a tibble containing all combinations of the various design parameters
tibble() %>% 
       expand(
         # Sample=1:1000,
         # For speed
         Sample=1:10,
         Gamma0=c(0, 0.1, 0.5),
         n=c(50, 100),
         alpha0=1,
         beta0=1
       ) %>% 
       #  For each combination of design parameters...
       group_by(Gamma0, n, Sample, alpha0, beta0) %>% 
       group_map(
         function(.x, .y) {
           #  ... create some data.
           df <- .x %>% 
                    expand(
                      nesting(Gamma0, n, Sample, alpha0, beta0), 
                      ID=1:n
                    ) %>% 
                    mutate(
                      X=rnorm(n),
                      H=Gamma0 * X,
                      E=rnorm(n),
                      U=E*H^(1/2),
                      Y=alpha0   beta0 * X   U
                    )
           # And analyse it. For example
           glm(Y ~ X, data=df) %>% 
             tidy() %>% 
             bind_cols(.y) %>% 
             add_column(model="OLS")
         },
         .keep=TRUE
       ) %>% 
       # Combine all analyses of each individual sample into a single tibble
       bind_rows()
# A tibble: 120 × 11
   term        estimate std.error statistic p.value Gamma0     n Sample alpha0 beta0 model
   <chr>          <dbl>     <dbl>     <dbl>   <dbl>  <dbl> <dbl>  <int>  <dbl> <dbl> <chr>
 1 (Intercept)     1     7.39e-17   1.35e16       0      0    50      1      1     1 OLS  
 2 X               1     6.79e-17   1.47e16       0      0    50      1      1     1 OLS  
 3 (Intercept)     1     0        Inf             0      0    50      2      1     1 OLS  
 4 X               1     0        Inf             0      0    50      2      1     1 OLS  
 5 (Intercept)     1.00  8.55e-17   1.17e16       0      0    50      3      1     1 OLS  
 6 X               1     8.37e-17   1.19e16       0      0    50      3      1     1 OLS  
 7 (Intercept)     1     4.00e-17   2.50e16       0      0    50      4      1     1 OLS  
 8 X               1     3.56e-17   2.81e16       0      0    50      4      1     1 OLS  
 9 (Intercept)     1     4.22e-17   2.37e16       0      0    50      5      1     1 OLS  
10 X               1     4.40e-17   2.27e16       0      0    50      5      1     1 OLS  
# … with 110 more rows
  • Related