Use for-loop and if function to create a new vector?

I want to do the following: draw a sample of n = 30 from a given normal distribution and calculate the mean of each sample (up to this step my code works without any problem). After that, I want to create a new vector containing "Yes" or "No", depending on whether the mean falls within a certain range. Sadly, the code does not carry out this step. I always get a vector with 13 elements, but there should be 500. What is the problem? Where is my mistake?

o = 13
u = 7
d = c()
for (i in 1:500){
  i = rnorm(30,mean = 10,sd = 6.04)
  i = mean(i)
  if (i <= o & i >=u) {
    d[i]=("Yes")
  } else {
    d[i]=("No")
  }
}

CodePudding user response:

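You should avoid changing the value of your iterator (i) within your loop. In your case, i is overwritten with the sample mean, so it becomes a non-integer value. When you then index your d vector with it, R uses only the integer portion of i. Because the sample means cluster around 10 and rarely reach 14, the largest position ever written is typically 13, which is why d ends up with 13 elements instead of 500.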

Consider what happens when I have a vector

x <- 1:4

and I take the pi index of it.

x[pi]
# [1] 3
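The same truncation applies when assigning, which is what happens to d in the question. A minimal sketch, using two made-up means (10.7 and 13.2) purely for illustration:

```r
# Assigning to a fractional index truncates the index toward zero.
d <- c()
d[10.7] <- "Yes"   # stored at position 10
d[13.2] <- "No"    # stored at position 13, so d grows to length 13

length(d)          # 13
d[10]              # "Yes"
sum(is.na(d))      # positions that were never assigned are NA
```

Every later mean that truncates to an index already used simply overwrites that slot, so the vector never gets longer than the largest truncated mean.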

Your code should look more like this:

o = 13
u = 7

d = c()

for (i in 1:500){
  sample_i = rnorm(30, mean = 10, sd = 6.04)
  mean_i = mean(sample_i)
  if (mean_i <= o & mean_i >=u) {
    d[i]=("Yes")
  } else {
    d[i]=("No")
  }
}

If you would like to improve your code some, here are some suggestions:

First, avoid "growing" your results. This has performance implications. It is better to decide how long your result (d) should be and set it to that length to begin with.

Next, try not to hard code the number of iterations into your loop. Get familiar with seq_along and seq_len and use them to count iterations for you.

o = 13
u = 7

d = character(500)  #  I made a change here: preallocated as character, since the results are "Yes"/"No"

for (i in seq_along(d)){  # And I made a change here
  sample_i = rnorm(30, mean = 10, sd = 6.04)
  mean_i = mean(sample_i)
  if (mean_i <= o & mean_i >=u) {
    d[i]=("Yes")
  } else {
    d[i]=("No")
  }
}
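One reason to prefer seq_len and seq_along over 1:n shows up in the edge case where there is nothing to iterate over. The sketch below is only an illustration; count_iterations is a hypothetical helper written for this example, not something from the question.

```r
# Hypothetical helper: counts how many times a for loop body would run.
count_iterations <- function(index) {
  k <- 0
  for (i in index) k <- k + 1
  k
}

n <- 0
runs_colon   <- count_iterations(1:n)         # 1:0 counts down: c(1, 0)
runs_seq_len <- count_iterations(seq_len(n))  # seq_len(0) is empty

runs_colon    # 2
runs_seq_len  # 0
```

With 1:n the loop body runs twice even though n is 0, while seq_len(n) correctly skips the loop.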

CodePudding user response:

Re-assigning i looks like a bad idea to me.

Are you sure you want to do this in a for loop? If not, a vectorised solution with crossing (from the tidyverse; nice explanations at varianceexplained.org) should work pretty nicely, I think?

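library(dplyr)  # mutate, group_by, summarise, case_when, count, %>%
library(tidyr)  # crossing

o = 13
u = 7

# One row per (trial, round) pair: 500 trials x 30 draws each
crossing(trial = 1:500,
         rounds = 1:30) %>%
  mutate(num = rnorm(n(), mean = 10, sd = 6.04)) %>%
  group_by(trial) %>%
  summarise(mean = mean(num)) %>%   # one sample mean per trial
  mutate(d = case_when(mean <= o & mean >= u ~ "Yes",
                       TRUE ~ "No")) %>%
  count(d)                          # tally of "Yes" and "No"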
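If you would rather not load the tidyverse, roughly the same idea can be sketched in base R with replicate and ifelse. This is one possible variant, not the approach from either answer; the names n_trials and sample_means and the call to set.seed are assumptions added here for illustration and reproducibility.

```r
set.seed(1)  # assumed seed, only so the sketch is reproducible

o <- 13
u <- 7
n_trials <- 500  # assumed name for the number of samples

# Draw 30 values per trial and keep only the mean of each trial.
sample_means <- replicate(n_trials, mean(rnorm(30, mean = 10, sd = 6.04)))

# Vectorised range check: one "Yes"/"No" per sample mean.
d <- ifelse(sample_means >= u & sample_means <= o, "Yes", "No")

length(d)  # 500
table(d)   # tally of "Yes" and "No"
```

Here d has one entry per trial, so its length always matches n_trials regardless of the values of the means.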