I have a 2D matrix, such as:
import numpy as np
arr = np.array([[0,0,1],[0,1,0], [0,1,0], [0,0,1]])
Where arr is:
[[0 0 1]
 [0 1 0]
 [0 1 0]
 [0 0 1]]
I need to find the largest row number in each column where the value is 1. Or if there is none, then I would need to know that as well.
In the above example, through some array or encoding, I would need to know that:
- Column 0 : no match
- Column 1 : Row 2
- Column 2 : Row 3
I have done:
result = np.where(arr == 1)
Where result is:
(array([0, 1, 2, 3], dtype=int64), array([2, 1, 1, 2], dtype=int64))
The first array indicates the row and the second array indicates the column.
So, this tells me that:
- column 2 has a 1 at row 0 and row 3
- column 1 has a 1 at row 1 and row 2
I can also infer that column 0 has no 1's so that is fine.
So, I would need a way to find the largest value (row number) in array 0 corresponding to each unique value in array 1.
So, an example of what I would need to return is:
(array([2,3], dtype=int64), array([1,2], dtype=int64))
or something like that, from which I can tell that column 1 had a value of 1 occurring at a max row of 2, column 2 had a value of 1 occurring at a max row of 3, and so on. The first array in that example is the maximum row number corresponding (by index) to the second array, which indicates the column.
CodePudding user response:
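The columns that contain a 1 can be found with np.where and np.any(). We can then mask the array down to just those columns. Reversing the masked array along the rows and taking np.argmax gives the position of the last 1 in each column, counted from the bottom, which converts to the row index as below: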
cols = np.where((arr == 1).any(0))
mask = (arr == 1)[:, cols[0]]
rows = mask.shape[0] - np.argmax(mask[::-1, :], axis=0) - 1
# cols --> [1 2]
# rows --> [2 3]
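If the "no match" case needs to be explicit for every column, the same reversed-argmax idea can be kept at full width. This is only a sketch: the -1 sentinel and the name last_row are my own choices, not something from the question.

```python
import numpy as np

arr = np.array([[0, 0, 1], [0, 1, 0], [0, 1, 0], [0, 0, 1]])

mask = arr == 1
# argmax on the row-reversed mask gives the offset of the first True from the bottom
offset_from_bottom = np.argmax(mask[::-1, :], axis=0)
last_row = mask.shape[0] - 1 - offset_from_bottom
# argmax also returns 0 for all-False columns, so mark those with -1 (assumed sentinel)
last_row = np.where(mask.any(axis=0), last_row, -1)

print(last_row)  # [-1  2  3]
```

Here the index into last_row is the column number, so column 0 reads as "no match", column 1 as row 2, and column 2 as row 3.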
CodePudding user response:
This is what I came up with. It does the job, but not in a true numpy way. Hopefully there is a more efficient way without loops that makes better use of numpy.
import numpy as np
arr = np.array([[0,0,1],[0,1,0], [0,1,0], [0,0,1]])
result = np.where(arr == 1)
columns = result[1]
rows = result[0]
num_columns = arr.shape[1]  # number of columns in arr (was hardcoded to 464)
max_row_value = np.array([], dtype=np.ushort)
corresponding_column = np.array([], dtype=np.ushort)
for i in range(num_columns):
    # positions in the np.where output that belong to column i
    indexes = np.where(columns == i)
    indexes = indexes[0]
    if len(indexes):
        # largest row number at which column i holds a 1
        max_value = np.max(rows[indexes])
        max_row_value = np.append(max_row_value, max_value)
        corresponding_column = np.append(corresponding_column, i)
print(max_row_value)
print(corresponding_column)
Output:
[2 3]
[1 2]
Where the top array shows the max row value, and the bottom array is the column to which each max row value corresponds.
It matches what I stated with:
- Column 0 : no match
- Column 1 : Row 2
- Column 2 : Row 3
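For what it's worth, the loop can probably be dropped by reducing the np.where output with np.maximum.at. This is a sketch rather than a benchmarked solution; the -1 placeholder and the names max_row and found are assumptions of mine.

```python
import numpy as np

arr = np.array([[0, 0, 1], [0, 1, 0], [0, 1, 0], [0, 0, 1]])

rows, columns = np.where(arr == 1)

# one slot per column, pre-filled with -1 meaning "no 1 seen" (assumed sentinel)
max_row = np.full(arr.shape[1], -1, dtype=np.intp)
# unbuffered in-place maximum: each column slot keeps the largest row seen for it
np.maximum.at(max_row, columns, rows)

# drop the columns that never had a 1 to get the same two arrays as the loop
found = max_row >= 0
corresponding_column = np.flatnonzero(found)
max_row_value = max_row[found]

print(max_row_value)         # [2 3]
print(corresponding_column)  # [1 2]
```

Keeping max_row around is also handy, since max_row[c] == -1 directly tells you that column c had no match.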