How do I save the results of this for loop as a vector rather than as a single value?

I am having trouble saving the results of a for loop in the way that I want.

The loop I'm currently running looks like this:

# Setup objects
n = 100
R = (1:1000)
P = seq(-.9, .9, .1)
betahat_OLS = rep(NA, 1000)
Bhat_OLS = rep(NA, 19)

# Calculate betahat_OLS for each p in P and each r in R
for (p in P) {
  for (r in R) {
    # Simulate data
    v = rnorm(n)
    e = rnorm(n)
    z = rnorm(n)
    u = p*v + e
    x = z + v
    y = 0*x + u
    #Calculate betahat_OLS
    betahat_OLS[r] = sum(x*y)/sum(x^2)
  }
  #Calculate Bhat_OLS
  Bhat_OLS = sum(betahat_OLS)/1000-0
}

# Make a scatterplot with p on the x-axis and Bhat_OLS on the y-axis
plot(P, Bhat_OLS)

The loop seems to be working correctly, except for the fact that I would like to end up with 19 values of Bhat_OLS and only currently get 1 value. I want to have a Bhat_OLS value for each value of p in P so that I can plot Bhat_OLS against p.

CodePudding user response：

You can write your results into a data frame with two columns, P and Bhat_OLS, adding one row for each value of p.

# Setup objects
n = 100
R = (1:1000)
P = seq(-.9, .9, .1)
betahat_OLS = rep(NA, 1000)
Bhat_OLS = rep(NA, 19)

# initialize result data frame
results <- data.frame(matrix(ncol = 2, nrow = 0, 
                      dimnames = list(NULL, c("P", "Bhat_OLS"))))

# Calculate betahat_OLS for each p in P and each r in R
for (p in P) {
    for (r in R) {
        # Simulate data
        v = rnorm(n)
        e = rnorm(n)
        z = rnorm(n)
        u = p*v + e
        x = z + v
        y = 0*x + u
        #Calculate betahat_OLS
        betahat_OLS[r] = sum(x*y)/sum(x^2)
    }
    #Calculate Bhat_OLS
    Bhat_OLS = sum(betahat_OLS)/1000-0
    
    # insert P and Bhat_OLS into results
    results[nrow(results) + 1,] = c(p, Bhat_OLS)
}

# Make a scatterplot with p on the x-axis and Bhat_OLS on the y-axis
plot(results$P, results$Bhat_OLS)
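
Growing a data frame one row at a time is fine for 19 rows, but since the number of rows is known in advance, the data frame could also be preallocated and filled in place. The following is only a sketch of that variant: the helper name `sim_mean`, its `n_rep` argument, and the reduced replication count are assumptions made for illustration and are not part of the answer above.

```r
# Hypothetical helper: mean no-intercept OLS slope over n_rep simulated samples
sim_mean <- function(p, n = 100, n_rep = 1000) {
  est <- numeric(n_rep)
  for (r in seq_len(n_rep)) {
    v <- rnorm(n)
    e <- rnorm(n)
    z <- rnorm(n)
    u <- p * v + e
    x <- z + v
    y <- 0 * x + u
    est[r] <- sum(x * y) / sum(x^2)
  }
  mean(est)
}

P <- seq(-0.9, 0.9, 0.1)

# Preallocate one row per value of p, then fill the second column by index
results <- data.frame(P = P, Bhat_OLS = NA_real_)
set.seed(1)  # arbitrary seed, only for reproducibility of this sketch
for (i in seq_along(P)) {
  results$Bhat_OLS[i] <- sim_mean(P[i], n_rep = 200)
}
```

After the loop, `plot(results$P, results$Bhat_OLS)` works exactly as in the version above.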

CodePudding user response：

Because you loop over the values of P directly, you have no integer index to store each result at. You could loop over seq(P) instead and use P[i] inside the loop. Then, at the end of each outer iteration, assign to Bhat_OLS[i] rather than overwriting Bhat_OLS. Then it works.

# Setup objects
n <- 100
R <- (1:1000)
P <- seq(-.9, .9, .1)
betahat_OLS <- rep(NA, length(R))
Bhat_OLS <- rep(NA, length(P))

set.seed(42)  ## for sake of reproducibility

# Calculate betahat_OLS for each p in P and each r in R
for (i in seq(P)) {
  for (r in R) {
    # Simulate data
    v <- rnorm(n)
    e <- rnorm(n)
    z <- rnorm(n)
    u <- P[i]*v + e
    x <- z + v
    y <- 0*x + u
    #Calculate betahat_OLS
    betahat_OLS[r] <- sum(x*y)/sum(x^2)
  }
  #Calculate Bhat_OLS
  Bhat_OLS[i] <- sum(betahat_OLS)/1000 - 0
}

# Make a scatterplot with p on the x-axis and Bhat_OLS on the y-axis
plot(P, Bhat_OLS, xlim=c(-1, 1))

Alternative solution `vapply`

In a more R-ish way (right now it is more C-ish), you could define the simulation in a function sim() and use vapply for the outer loop. (vapply would also work for the inner loop, but in my tests keeping the for loop there was faster.)

sim <- \(p, n=100, R=1:1000) {
  r <- rep(NA, max(R))
  for (i in R) {
    v <- rnorm(n)
    e <- rnorm(n)
    z <- rnorm(n)
    u <- p*v + e
    x <- z + v
    y <- 0*x + u
    r[i] <- sum(x*y)/sum(x^2)
  }
  return(sum(r/1000 - 0))
}

set.seed(42)
Bhat_OLS1 <- vapply(seq(-.9, .9, .1), \(p) sim(p), 0)

stopifnot(all.equal(Bhat_OLS, Bhat_OLS1))
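
For comparison, here is a rough sketch of the variant in which the inner loop is also replaced by an apply-style call, using replicate(). The names `one_draw`, `sim2`, and `n_rep` are hypothetical and chosen only for this illustration, and no claim is made here about which version is faster.

```r
# Hypothetical helper: simulate one sample and return its no-intercept OLS slope
one_draw <- function(p, n = 100) {
  v <- rnorm(n)
  e <- rnorm(n)
  z <- rnorm(n)
  u <- p * v + e
  x <- z + v
  y <- 0 * x + u
  sum(x * y) / sum(x^2)
}

# Hypothetical wrapper: average the estimate over n_rep replications
sim2 <- function(p, n = 100, n_rep = 1000) {
  mean(replicate(n_rep, one_draw(p, n)))
}

P <- seq(-0.9, 0.9, 0.1)
set.seed(42)
# n_rep is passed through vapply's ... to sim2; 200 keeps the sketch quick
Bhat_OLS2 <- vapply(P, sim2, numeric(1), n_rep = 200)
```

As a sanity check, the values should lie close to P/2, because in this setup cov(x, u) = p and var(x) = 2.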

Note: the `\(p)` shorthand for function(p) requires R 4.1.0 or later.

R.version.string
# [1] "R version 4.1.2 (2021-11-01)"
