There is a portion of my f90 program that is taking up a significant amount of compute time. I am basically looping through three matrices (of the same size, with dimensions as large as 250-by-250), and trying to make sure values stay bounded within the interval [-1.0, 1.0]. I know that it is best practice to avoid conditionals in loops, but I am having trouble figuring out how to re-write this block of code for optimal performance. Is there a way to "unravel" the loop or use a built-in function of some sort to "vectorize" the conditional statements?
do ind2 = 1, size(u_mat,2)
    do ind1 = 1, size(u_mat,1)
        ! Dot product 1 must be bounded between [-1,1]
        if (b1_dotProd(ind1,ind2) .GT. 1.0_dp) then
            b1_dotProd(ind1,ind2) = 1.0_dp
        else if (b1_dotProd(ind1,ind2) .LT. -1.0_dp) then
            b1_dotProd(ind1,ind2) = -1.0_dp
        end if
        ! Dot product 2 must be bounded between [-1,1]
        if (b2_dotProd(ind1,ind2) .GT. 1.0_dp) then
            b2_dotProd(ind1,ind2) = 1.0_dp
        else if (b2_dotProd(ind1,ind2) .LT. -1.0_dp) then
            b2_dotProd(ind1,ind2) = -1.0_dp
        end if
        ! Dot product 3 must be bounded between [-1,1]
        if (b3_dotProd(ind1,ind2) .GT. 1.0_dp) then
            b3_dotProd(ind1,ind2) = 1.0_dp
        else if (b3_dotProd(ind1,ind2) .LT. -1.0_dp) then
            b3_dotProd(ind1,ind2) = -1.0_dp
        end if
    end do
end do
For what it's worth, I am compiling with ifort.
CodePudding user response:
You can use the intrinsic min and max functions for this.
Since both are elemental, you can apply them to a whole array at once:
b1_dotProd = max(-1.0_dp, min(b1_dotProd, 1.0_dp))
Whether this actually avoids branches, and whether it is any faster, depends on how the compiler implements min and max, but it is at least a lot more concise.
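To apply the same idea to all three arrays without repeating the expression, the clamp can be wrapped in a small subroutine. The following is only a sketch: the module name clamp_demo and the subroutine name clamp_unit are hypothetical, and dp is assumed to be real64, since the question does not show how dp is defined.

```fortran
module clamp_demo
  ! Assumption: dp is double precision (real64); the original code
  ! does not show its definition.
  use, intrinsic :: iso_fortran_env, only: dp => real64
  implicit none
contains
  ! Clamp every element of a to the interval [-1, 1].
  ! min and max are elemental intrinsics, so the scalar bounds are
  ! applied element by element with no explicit loop.
  subroutine clamp_unit(a)
    real(dp), intent(inout) :: a(:,:)
    a = max(-1.0_dp, min(a, 1.0_dp))
  end subroutine clamp_unit
end module clamp_demo
```

With this in place, the double loop from the question would reduce to three calls: call clamp_unit(b1_dotProd), call clamp_unit(b2_dotProd), and call clamp_unit(b3_dotProd). Whether it runs faster than the original loop is something to measure with your own compiler and flags.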