The following code runs 599 bootstrap iterations using data stored in the dictionary data_rois. data_rois is a dictionary with many keys, and each key is associated with an array of numeric values. This part of the code works fine when written as below:
boot_i = []
for i in range(599):
    boot = np.random.choice(data_rois["interoception"], size=N)
    boot = np.mean(boot)
    boot_i.append(boot)
Now, I would like to apply bootstrapping to many keys in the dictionary data_rois. Therefore, I use a for loop, as below, that aims to store the bootstrapping results in another dictionary called boot_rois. The loop is meant to shorten the code above, which would get really long if I had to repeat it for every key in data_rois.
rois = ["interoception", "extero", ...] # A long list of rois
boot_rois = {}
for roi in rois:
    for i in range(599):
        boot = np.random.choice(data_rois[roi], size=N)
        boot = np.mean(boot)
        boot_rois[roi] = roi
The problem: The code runs without errors. However, it appears to ignore for i in range(599) and only runs boot = np.random.choice(data_rois[roi], size=N) one time instead of 599 times. What line of code is missing in the nested for loop so that it runs bootstrapping 599 times instead of 1 time?
Update: I specify my aim here. My aim is to compute the standard deviation (SD) for each roi, based on the 599 bootstraps.
Here is updated code suggested by someone in this thread. I changed it to compute the SD, and the results look fine.
boot_rois = {}
for roi in rois:
    last_boot = None
    for i in range(599):
        boot = np.random.choice(data_rois[roi], size=N)
        boot = np.std(boot)
        if last_boot is not None:
            boot = np.std([boot, last_boot])
        boot_rois[roi] = boot
        last_boot = boot_rois[roi]
CodePudding user response:
It is partially unclear what you are trying to do since you didn't provide a minimal reproducible example. I've filled in as best I can and I think this should still help you solve your problem. You may need to adjust the type of aggregation to fit your needs.
Your for loop is of course executing as many times as it says it is. I believe you forgot to assign boot to boot_rois[roi]; as written, your code stores roi itself, so boot_rois just reproduces the ROI list.
boot_rois[roi] = boot
A secondary problem with your code is that every time your inner for loop runs, you are just overwriting the same key in your dictionary. You probably want to do something like this instead. It isn't completely clear what type of math you are trying to do, but assuming you want to average your 599 random arrays in a rolling fashion, you could do this:
import numpy as np
N = 10
data_rois = {"interoception": [1, 2], "extero": [2, 3]}
rois = data_rois.keys() # A long list of rois
boot_rois = {}
for roi in rois:
    last_boot = None
    for i in range(7):
        boot = np.random.choice(data_rois[roi], size=N)
        print(boot)
        boot = np.mean(boot)
        # During first iteration aggregator last_boot is None
        if last_boot is not None:
            # Average with the last iteration and repeat
            # This logic may need to be replaced with whatever math you are trying to do
            boot = np.mean([boot, last_boot])
        boot_rois[roi] = boot
        last_boot = boot_rois[roi]
print(boot_rois)
Note: Doing it this way means the result is not the plain average of all 7 values. If you want that, accumulate the values in a sum variable and divide by the number of iterations of the inner for loop. Mathematically, repeatedly averaging is different from summing everything and dividing by the count. Make sure your math is correct.
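To make the difference concrete, here is a small sketch using made-up values (the list `values` and the variable names are illustrative, not taken from the question). It compares the rolling pairwise average used in the loop above with the plain mean of all values.

```python
import numpy as np

# Hypothetical per-iteration results, chosen only for illustration
values = [1.0, 2.0, 3.0, 4.0]

# Rolling approach: average each new value with the previous aggregate
rolling = None
for v in values:
    rolling = v if rolling is None else np.mean([v, rolling])

# Plain approach: keep every value, then aggregate once
overall = np.mean(values)

print(rolling)  # 3.125
print(overall)  # 2.5
```

The rolling version weights later iterations more heavily (the last value counts for half of the result), so for a bootstrap it is usually safer to append each result to a list and aggregate once at the end.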
CodePudding user response:
What about using a list comprehension instead of nested loops:
rois = ["interoception", "extero", ...] # A long list of rois
boot_rois = {}
for roi in rois:
    # will execute np.random.choice 599 times and store these results in a list
    rand_choices = [np.mean(np.random.choice(data_rois[roi], size=N)) for _ in range(599)]
    # will calculate the standard deviation of those 599 results
    boot_rois[roi] = np.std(rand_choices)
This will run np.random.choice 599 times and store the results in a list (I added np.mean(...) so that you calculate the stdev on those 599 mean values). You can then run np.std on that list and store the final result in boot_rois[roi].
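If it helps, the same idea can be wrapped in a small helper. This is only a sketch: `bootstrap_sd_of_means`, the toy `data_rois` values, `N = 10`, and the seed are assumptions for illustration. It also swaps in NumPy's `default_rng` generator in place of `np.random.choice` so that a run is reproducible, and it draws all resamples at once as a 2-D array instead of looping.

```python
import numpy as np

def bootstrap_sd_of_means(values, n_boot=599, size=None, seed=None):
    """Return the SD of n_boot bootstrap means of `values` (hypothetical helper)."""
    values = np.asarray(values, dtype=float)
    if size is None:
        size = values.size  # default: resample as many points as observed
    rng = np.random.default_rng(seed)
    # Each row is one bootstrap resample drawn with replacement
    samples = rng.choice(values, size=(n_boot, size), replace=True)
    return samples.mean(axis=1).std()

# Toy data standing in for the real data_rois dictionary
data_rois = {"interoception": [1, 2, 3, 4, 5], "extero": [2, 3, 5, 8, 13]}
N = 10

boot_rois = {roi: bootstrap_sd_of_means(vals, size=N, seed=0)
             for roi, vals in data_rois.items()}
print(boot_rois)
```

Iterating over `data_rois.items()` avoids maintaining a separate list of keys. If you only want a subset of ROIs, loop over your own list instead.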
Below is runnable code I used for testing. It generates 20 random integers between 0 and 50 and calculates the stdev:
import random
import numpy as np
rand_ints = [random.randint(0, 50) for _ in range(20)]
print(rand_ints)
stdev = np.std(rand_ints)
print(stdev)
First execution:
[9, 44, 13, 0, 43, 12, 4, 40, 35, 38, 43, 0, 3, 38, 39, 45, 37, 14, 4, 21]
16.908281994336384
Second execution:
[2, 20, 17, 32, 0, 39, 23, 27, 24, 41, 8, 21, 2, 7, 21, 3, 27, 7, 15, 36]
12.531560158256433
In order to emulate calculating the stdev of means of samples, I tried this:
import random
import numpy as np
rand_ints = [np.mean([random.randint(0, 50) for _ in range(10)]) for _ in range(20)]
print(rand_ints)
stdev = np.std(rand_ints)
print(stdev)
Which gave me:
[25.8, 16.9, 27.6, 21.8, 20.6, 30.5, 19.4, 32.9, 27.8, 18.5, 24.5, 18.7, 23.1, 26.9, 30.6, 25.1, 24.9, 26.5, 21.8, 25.8]
4.2607833786758045
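As a rough sanity check on that last number, assuming the draws are independent and uniform over the integers 0 to 50 (which is what random.randint(0, 50) samples from), the SD of a mean of 10 draws should be close to the SD of a single draw divided by sqrt(10). The sketch below computes that theoretical value; the variable names are illustrative.

```python
import math

# Discrete uniform on the integers 0..50 (what random.randint(0, 50) draws from)
n_values = 51
sd_single = math.sqrt((n_values**2 - 1) / 12)

# Standard error of the mean of 10 independent draws
sd_of_mean = sd_single / math.sqrt(10)

print(round(sd_single, 2))   # 14.72
print(round(sd_of_mean, 2))  # 4.65
```

The observed 4.26 is an estimate from only 20 means, so landing in the neighborhood of 4.65 rather than exactly on it is plausible.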
CodePudding user response:
I am not fully sure about the structure behind your dictionary. A fuller code sample would help. For instance, what is the rois you are iterating over in your outer loop? You spoke of data_rois, and you created boot_rois, but rois itself is described nowhere.