Correctly Understanding the Use of the replicate() function

Time:06-20

I have the following function:

library(dplyr)

var_1 <- rnorm(100, 10, 10)
var_2 <- rnorm(100, 1, 10)
var_3 <- rnorm(100, 5, 10)
response <- rnorm(100, 1, 1)
my_data <- data.frame(var_1, var_2, var_3, response)
my_data$id <- 1:100

simulate <- function() {
  results <- list()
  results2 <- list()
  for (i in 1:100) {
    iteration_i <- i
    sample_i <- my_data[sample(nrow(my_data), 10), ]
    results_tmp <- data.frame(iteration_i, sample_i)
    results[[i]] <- results_tmp
  }
  results_df <- do.call(rbind.data.frame, results)
  test_1 <- data.frame(results_df %>% 
                         group_by(id) %>% 
                         filter(iteration_i == min(iteration_i)) %>% 
                         distinct)
  summary_file <- data.frame(test_1 %>% 
                               group_by(iteration_i) %>% 
                               summarise(Count=n()))
  cumulative <- cumsum(summary_file$Count)
  summary_file$Cumulative <- cumulative
  summary_file$unobserved <- 100 - cumulative
  return(summary_file)
}

When I call this function, I get the following output:

> head(simulate())
  iteration_i Count Cumulative unobserved
1           1    10         10         90
2           2     7         17         83
3           3    10         27         73
4           4     5         32         68
5           5     7         39         61
6           6     8         47         53

I want to run this function 10 times and append all the results into a single data frame. I tried to do this with the replicate() function (and a few alternatives), but none of these attempts worked:

# Method 1: Did not work
n_replicates = 10
iterations_required <- replicate(n_replicates, {
  simulate()
})

# Method 2: Did not work
lapply(seq_len(10), simulate(1))

# Method 3: Did Not Work

library(purrr)
rerun(10, simulate(1))

# Method 4: Did Not Work

lapply(seq_len(10), simulate)

Ideally, I would like to get something like this:

# works fine!
results <- list()
for (i in 1:10) { 
  game_i <- i
  s_i <- simulate()
  results_tmp <- data.frame(game_i, s_i)
  results[[i]] <- results_tmp
}

final_file <- do.call(rbind.data.frame, results)

My question: why did Methods 1 through 4 not work, and could someone please show me how to fix them?

CodePudding user response:

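# Method 1: keep the list of data frames instead of simplifying it
n_replicates <- 10
iterations_required <- replicate(n_replicates, {
  simulate()
}, simplify = FALSE)

# Method 2: lapply() needs a function, not the result of a call.
# simulate() takes no arguments, so call it without one.
iterations_required <- lapply(seq_len(10), function(x) simulate())

# Method 4: works only if simulate() is redefined to accept one argument
# (the index); as written in the question it fails with "unused argument".

iterations_required <- lapply(seq_len(10), simulate)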

# to merge into one data.frame
as.data.frame(data.table::rbindlist(iterations_required, idcol=TRUE))
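If data.table is not installed, roughly the same merge can be sketched in base R. The simulate() below is only a small stand-in so that the example runs on its own, and game_i is an illustrative name for the column that plays the role of the .id column added by idcol=TRUE.

```r
# Stand-in for the question's simulate(); replace it with the real function.
simulate <- function() {
  count <- sample(1:10, 5, replace = TRUE)
  data.frame(iteration_i = 1:5, Count = count, Cumulative = cumsum(count))
}

iterations_required <- replicate(10, simulate(), simplify = FALSE)

# Number of rows contributed by each replicate.
rows_per_run <- vapply(iterations_required, nrow, integer(1))

# Stack the list, then label every row with the replicate it came from.
final_file <- do.call(rbind.data.frame, iterations_required)
final_file <- data.frame(
  game_i = rep(seq_along(iterations_required), times = rows_per_run),
  final_file
)
```

Because the row counts are taken from the list itself, this also works when the replicates return different numbers of rows, as the real simulate() does.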

Alternatively, you could modify your function to take an argument, simulate(i), where i (the iteration index) becomes the first column of the output. Then you could use do.call(rbind.data.frame, lapply(seq_len(n_replicates), simulate)).
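A minimal sketch of that idea, assuming you would rather wrap the existing function than edit it. Both simulate() (a short stand-in here) and the wrapper name simulate_indexed are hypothetical and only serve to make the example self-contained.

```r
# Stand-in for the question's simulate(); replace it with the real function.
simulate <- function() {
  count <- sample(1:10, 5, replace = TRUE)
  data.frame(iteration_i = 1:5, Count = count, Cumulative = cumsum(count))
}

# Hypothetical wrapper: accepts the replicate index that lapply() passes in
# and prepends it to the result as the first column, game_i.
simulate_indexed <- function(i) {
  data.frame(game_i = i, simulate())
}

n_replicates <- 10
final_file <- do.call(
  rbind.data.frame,
  lapply(seq_len(n_replicates), simulate_indexed)
)
```

Since the wrapper takes exactly one argument, it can be handed to lapply() directly, which is what Method 4 attempted with the zero-argument simulate().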

CodePudding user response:

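replicate() by default tries to simplify the result into a matrix or array (its default is simplify = "array"). A data frame is a list of columns, so the simplified result is a matrix of lists rather than a list of data frames. The trick is simply not to simplify.

n_replicates <- 10
iterations_required <- replicate(n_replicates, simulate(), simplify = FALSE)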
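The difference can be seen in a small sketch. The function one_run() is a hypothetical stand-in that returns a two-column data frame, so the example does not depend on the question's data.

```r
# Hypothetical stand-in: a tiny data frame with two columns and three rows.
one_run <- function() {
  data.frame(iteration_i = 1:3, Count = sample(1:10, 3))
}

# Default behaviour: the four data frames are flattened into a list-matrix
# with one row per column name and one column per replicate.
simplified <- replicate(4, one_run())
is.matrix(simplified)
dim(simplified)

# With simplify = FALSE each replicate stays an intact data frame.
kept <- replicate(4, one_run(), simplify = FALSE)
is.data.frame(kept[[1]])

# The intact list can then be stacked in the usual way.
combined <- do.call(rbind.data.frame, kept)
```

The simplified object still holds all the values, but each cell is a whole column, which is why it cannot be passed to rbind as a list of data frames.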
Tags: r