I have a 2D NumPy array of shape (15077, 5). All the values are less than or equal to 1.0. I'm essentially trying to do the following:
product = array.prod(axis=0)
product = product / product.sum()
So basically I want an array holding the product of each column of the 2D array, normalized so that it sums to 1. The above code works fine for smaller inputs, but with the input I'm dealing with now the products underflow and I end up with an array of all 0s. I've verified there are no 0s in the input array.
I've tried using the longdouble type and still seem to have the problem. I've tried to figure out ways of normalizing such as this:
def normalized_product(array):
    results = np.ones(len(array[0]))
    multiplier = 1
    for row in array:
        results = results * (row * multiplier)
        # Rescale by powers of 2 so the largest entry stays close to 1.
        while results.max() > 1:
            results = results / 2
            multiplier = multiplier / 2
        while results.max() > 0 and results.max() < 1:
            results = results * 2
            multiplier = multiplier * 2
    return results / results.sum()
While the above code does end up returning an array that isn't all zeros, I'm not convinced it's doing the correct thing. One of the elements is 0. I'm unsure if that's because the algorithm is wrong or if it's because there's so much difference between that column and the other columns that it's
Is there a way to do this that correctly accounts for overflow and underflow?
CodePudding user response:
Assuming each number is positive, you can transform each array value to its logarithm and then sum them. This is because
log(x) + log(y) = log(xy)
So, I would do something like the following:
log_product = np.log(array).sum(axis=0)
result = log_product - np.logaddexp.reduce(log_product)
which would give you the logarithm of the desired calculation. You could then perform exponentiation to get the desired result, though that number may also be vanishingly small due to precision issues.
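To make the log-space idea concrete, here is a minimal sketch, assuming all inputs are strictly positive. The function name `normalized_column_products` and the random test data are made up for illustration. It sums the logs per column, subtracts the largest log-product before exponentiating (so the biggest term becomes exp(0) = 1), and only then normalizes.

```python
import numpy as np

def normalized_column_products(array):
    """Column products of `array`, normalized to sum to 1, computed in log space."""
    # The sum of logs down each column is the log of that column's product.
    log_prod = np.log(array).sum(axis=0)
    # Shift so the largest log-product is 0; this rescales every column by
    # the same factor, which cancels out in the normalization below.
    shifted = log_prod - log_prod.max()
    weights = np.exp(shifted)
    return weights / weights.sum()

# Hypothetical data with the same shape as in the question.
rng = np.random.default_rng(0)
array = rng.uniform(0.1, 1.0, size=(15077, 5))

print(array.prod(axis=0))                 # naive product for comparison
print(normalized_column_products(array))  # log-space result
```

If one column's log-product is far below the largest (by roughly 745 or more for float64), its normalized value will still come out as 0 after exponentiation. In that case the 0 reflects the true size of the gap rather than a bug, and it may be better to keep working with the log values directly.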