I'm trying to parallelize this function using @async and @sync:
function INSR_opt(f)
    function INSR0_opt(seq)
        len = length(seq)
        res = seq[end]
        @inbounds @sync for i in range(len-2, step=-1, stop=0)
            @async res = f([seq[i+1], res])
        end
        return res
    end
    return INSR0_opt
end
The way I'm using the macros seems correct to me, but the performance just gets worse.
Without the macros:
122.962 μs (1073 allocations: 69.00 KiB)
With the macros:
154.681 μs (1091 allocations: 69.95 KiB)
I've even tried using @spawn instead of @async, but the performance still doesn't improve. I've checked the number of threads with Threads.nthreads(), and it is 4.
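For reference, the timings above come from BenchmarkTools, roughly like this (the f and input sequence here are just stand-ins for my real ones):

using BenchmarkTools

g = INSR_opt(((a, b),) -> a + b)   # placeholder f
seq = collect(1.0:1000.0)          # placeholder input
@btime $g($seq)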
CodePudding user response:
Your code is inherently sequential: each iteration has a recursive dependence on res, so parallelizing it with tasks is not possible and can also produce incorrect results. Essentially, your code re-implements foldr in a less efficient and non-generic way:
julia> INSR_opt(((a, b),) -> a => b)(1:4)
1 => (2 => (3 => 4))
julia> foldr(=>, 1:4)
1 => (2 => (3 => 4))
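If you want to keep your call signature (f taking a two-element vector), a thin wrapper around foldr is enough; INSR_foldr below is just an illustrative name for the sketch:

julia> INSR_foldr(f) = seq -> foldr((a, b) -> f([a, b]), seq)
INSR_foldr (generic function with 1 method)

julia> INSR_foldr(((a, b),) -> a => b)(1:4)
1 => (2 => (3 => 4))

This keeps f unchanged while delegating the actual recursion to foldr, which handles the right-to-left accumulation generically.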