I have two 2D arrays, one containing float values and one containing bools. I want to create an array holding the mean of each column of the first array, considering only the values where the second array is False.
For example:
A = [[1 3 5]
[2 4 6]
[3 1 0]]
B = [[True False False]
[False False False]
[True True False]]
result = [2, 3.5, 3.67]
CodePudding user response:
Where B is False, keep the value of A; otherwise replace it with NaN. Then use the nanmean function, which ignores NaNs when computing the mean.
np.nanmean(np.where(~B, A, np.nan), axis=0)
>>> array([2. , 3.5 , 3.66666667])
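For reference, here is a minimal self-contained sketch of that one-liner. It assumes A and B are NumPy arrays built from the question's example (plain nested lists would not support ~B):

```python
import numpy as np

# Example data from the question, as NumPy arrays (an assumption of this sketch).
A = np.array([[1, 3, 5],
              [2, 4, 6],
              [3, 1, 0]], dtype=float)
B = np.array([[True, False, False],
              [False, False, False],
              [True, True, False]])

# Keep A where B is False, put NaN elsewhere.
masked = np.where(~B, A, np.nan)

# Column-wise mean that skips the NaN entries.
result = np.nanmean(masked, axis=0)
print(result)
```

Note that if some column of B is entirely True, nanmean returns NaN for that column and emits a RuntimeWarning.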
CodePudding user response:
Use numpy.mean with its where argument to specify which elements to include in the mean.
np.mean(A, where = ~B, axis = 0)
>>> [2. 3.5 3.66666667]
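As a rough sketch, the call can be wrapped in a small helper that also accepts plain nested lists. The function name column_mean_where is hypothetical, and the where argument of numpy.mean requires NumPy 1.20 or later:

```python
import numpy as np

def column_mean_where(values, exclude):
    """Column means of `values`, skipping entries where `exclude` is True.

    Hypothetical helper for illustration; not part of NumPy.
    """
    values = np.asarray(values, dtype=float)
    exclude = np.asarray(exclude, dtype=bool)
    # `where` selects which elements take part in the mean.
    return np.mean(values, axis=0, where=~exclude)

A = [[1, 3, 5], [2, 4, 6], [3, 1, 0]]
B = [[True, False, False], [False, False, False], [True, True, False]]
print(column_mean_where(A, B))
```

Converting with np.asarray inside the helper means the caller does not have to build the arrays first.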
CodePudding user response:
A = [[1, 3, 5],
     [2, 4, 6],
     [3, 1, 0]]
B = [[True, False, False],
     [False, False, False],
     [True, True, False]]
sums = [0]*len(A[0])      # per-column sum of the kept values
amounts = [0]*len(A[0])   # per-column count of the kept values
for i in range(0, len(A)):
    for j in range(0, len(A[0])):
        sums[j] = sums[j] + (A[i][j] if not B[i][j] else 0)
        amounts[j] = amounts[j] + (1 if not B[i][j] else 0)
result = [sums[i]/amounts[i] for i in range(0, len(sums))]
print(result)
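The same sum-and-count idea can also be written without explicit loops. This is only a sketch of one possible vectorized equivalent, assuming A and B are converted to NumPy arrays; the names keep and counts are illustrative:

```python
import numpy as np

A = np.array([[1, 3, 5],
              [2, 4, 6],
              [3, 1, 0]], dtype=float)
B = np.array([[True, False, False],
              [False, False, False],
              [True, True, False]])

keep = ~B                                 # True where a value should be counted
sums = np.where(keep, A, 0).sum(axis=0)   # per-column sum of kept values
counts = keep.sum(axis=0)                 # per-column number of kept values
result = sums / counts
print(result)
```

As with the loop version, a column in which every element of B is True leads to a division by zero; with NumPy arrays this yields NaN plus a RuntimeWarning rather than an exception.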
CodePudding user response:
There may be some fancy numpy trick for this, but I think using a list comprehension to construct a new array is the most straightforward.
result = np.array([a_col[~b_col].mean() for a_col, b_col in zip(A.T,B.T)])
To make it easier to follow, this is what the line does when expanded into a loop:
result = []
for i in range(A.shape[1]):  # iterate over columns, not rows
    new_col = A[:, i][~B[:, i]]
    result.append(new_col.mean())
CodePudding user response:
You could also use a masked array:
import numpy as np
result = np.ma.array(A, mask=B).mean(axis=0).filled(fill_value=0)
# Output:
# array([2. , 3.5 , 3.66666667])
which has the advantage of being able to supply a fill_value for when every element in some column in B is True.
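To illustrate the fill_value point, here is a small sketch in which the first column is fully masked. The mask B_all_true is a made-up variation of the question's B, used only for this example:

```python
import numpy as np

A = np.array([[1, 3, 5],
              [2, 4, 6],
              [3, 1, 0]], dtype=float)

# Hypothetical mask: every element of the first column is True (excluded).
B_all_true = np.array([[True, False, False],
                       [True, False, False],
                       [True, True, False]])

# The mean of a fully masked column is itself masked ...
col_means = np.ma.array(A, mask=B_all_true).mean(axis=0)

# ... and filled() replaces masked entries with the chosen value.
result = col_means.filled(fill_value=0)
print(result)
```

Here the first entry of the result comes from fill_value instead of being NaN, while the other two columns are averaged as before.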