I have a question regarding multithreading in Julia and how to parallelize a for loop effectively.
Suppose you have a nested for loop and a computer with 4 cores. A straightforward way is to add Threads.@threads
in front of the for loop, assuming that the iterations can run without interfering with each other.
As I understand it, only the outermost loop of the nested for loop is parallelized. With N = 15 and M = 14, a computer with 4 cores would be the bottleneck.
However, on a PC with 32 cores, 32 - 15 = 17 cores would be doing nothing, even though there are 15 * 14 = 210 combinations in total to compute.
Is this correct? Is this how Threads.@threads works? Is there a way to parallelize over the combinations of both i and j, perhaps using FLoops? I have tried to read the documentation, but I need to know whether I am going in a completely wrong direction.
Threads.@threads for i in 1:N
    for j in 1:M
        # Do stuff
    end
end
vs.
using FLoops
@floop for i in 1:N
    for j in 1:M
        # Do stuff
    end
end
Thanks in advance
CodePudding user response:
You could probably loop over a single index k that covers all N*M combinations and recover the two variables i and j from it.
Threads.@threads for k in 1:(N*M)
    i = (k - 1) ÷ M + 1  # 1-based outer index, 1:N
    j = (k - 1) % M + 1  # 1-based inner index, 1:M
    # Do stuff
end
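The index arithmetic is easy to get wrong with 1-based ranges: k % M on its own gives 0 whenever k is a multiple of M. Here is a minimal sketch of one way to check a mapping before using it in a threaded loop. The helper name to_pair is made up for illustration, and N = 15, M = 14 are the values from the question.

```julia
N, M = 15, 14

# Hypothetical helper: map a 1-based linear index k in 1:(N*M)
# to a 1-based pair (i, j) with i in 1:N and j in 1:M.
function to_pair(k, M)
    q, r = divrem(k - 1, M)  # zero-based quotient and remainder
    return q + 1, r + 1
end

# Every k should produce a distinct, in-range pair.
pairs_seen = [to_pair(k, M) for k in 1:(N * M)]

# CartesianIndices does the same bookkeeping without manual arithmetic.
# Note that it varies the first index fastest (column-major order).
ci = CartesianIndices((N, M))
```

Subtracting 1 before dividing and adding 1 afterwards keeps both indices in range, so all 210 combinations are visited exactly once.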
Alternatively, Iterators.product (from Base) yields both i and j directly, without the two extra lines.
@floop for (i, j) in Iterators.product(1:N, 1:M)
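If you would rather not add a dependency, the same idea can be sketched with Base only: collect the product into a flat vector and let Threads.@threads split that. This is just an illustration under assumptions. The body i * j stands in for "Do stuff", and the names index_pairs and results are made up. It also assumes each iteration writes only to its own slot, so no locking is needed.

```julia
using Base.Threads: @threads

N, M = 15, 14

# Materialize all (i, j) combinations as a flat vector of tuples
# so that @threads can divide the N*M items among the threads.
index_pairs = vec(collect(Iterators.product(1:N, 1:M)))

# Each iteration writes to a different element, so there is no data race.
results = zeros(Int, N, M)
@threads for k in eachindex(index_pairs)
    i, j = index_pairs[k]
    results[i, j] = i * j  # placeholder for the real work
end
```

Start Julia with more than one thread (for example julia --threads 4) to see any parallelism. With a single thread the loop still runs correctly, just serially.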