Running analysis on for loop x times-CodePudding

I have the following code that selects 4 rows of iris 1000x, and takes the mean of each 4 row sample:

library(dplyr)

iris<- iris

storage<- list()


counter<- 0
for (i in 1:1000) {
  # sample 3 randomly selected transects 100 time
  tempsample<- iris[sample(1:nrow(iris), 4, replace=F),]

  storage[[i]]=tempsample

  
  counter<- counter 1
  print(counter)
}

# Unpack results into dataframe 
results<- do.call(rbind, storage)
View(results)

results_2<- as.data.frame(results)
results_2<- results_2 %>% mutate(Aggregate = rep(seq(1,ceiling(nrow(results_2)/4)),each = 4))
# View(results_2)


final_results<- aggregate(results_2[,1:4], list(results_2$Aggregate), mean)
# View(final_results)

I want to calculate the bias of each column in relation to their true population parameter. For example using SimDesign's bias():

library(SimDesign)
(bias(final_results[,2:5], parameter=c(5,3,2,1), type='relative'))*100

In this code, the values of parameter are hypothetical true pop. values of each column in the dataframe. I want to do this process 100x to get a distribution of bias estimates for each variable in the dataframe. However, I'm not sure how to fit all of this into a for loop (what I think would be the way to go) so the final output is a dataframe with 100 rows of bias measurements for each iris variable.

Any help with this would be greatly appreciated.

CodePudding user response：

Here is one way to do that. I've made some minor changes to your code, and wrapped it in a function. Then, use lapply over a sequence say 1:10 or 1:100, each time running your function, and feeding the result to your bias function from the SimDesign package. Then row bind the resulting list

library(dplyr)

get_samples <- function(df, size=4, n=1000) {

  storage<- list()
  counter<- 0
  
  for (i in 1:1000) {
    tempsample<- df[sample(1:nrow(df), size, replace=F),]
    storage[[i]]=tempsample
    counter<- counter 1
  }
  
  results<- do.call(rbind, storage)
  results_2<- as.data.frame(results)
  results_2<- results_2 %>% mutate(Aggregate = rep(seq(1,ceiling(nrow(results_2)/size)),each = size))
  final_results<- aggregate(results_2[,1:size], list(results_2$Aggregate), mean)
  return(final_results)
}


iris=iris

replicates = lapply(1:10, function(x) {
  result = get_samples(iris)
  (bias(result[,2:5], parameter=c(5,3,2,1), type='relative'))*100
})

replicates = do.call(rbind, replicates)

Output:

      Sepal.Length Sepal.Width Petal.Length Petal.Width
 [1,]     41.50617    3.292500     86.77408    8.859333
 [2,]     43.26058    2.763500     90.20758   10.825917
 [3,]     43.46642    3.551750     90.11767   10.576250
 [4,]     41.94683    2.970833     86.89625    8.817000
 [5,]     42.08733    3.380917     86.78642    8.996667
 [6,]     42.13050    2.942250     88.02983    9.707500
 [7,]     43.07383    2.775500     89.04583   10.102083
 [8,]     44.10192    2.895167     91.27208   11.188500
 [9,]     41.29408    2.314750     87.59208    9.244333
[10,]     42.77450    2.781583     90.37342   10.789500