How to find the maximum value in one array, which corresponds to unique values/points in another arr

Time:04-08

I have a 2D matrix, such as:

import numpy as np
arr = np.array([[0,0,1],[0,1,0], [0,1,0], [0,0,1]])

Where arr is:

[[0 0 1]
 [0 1 0]
 [0 1 0]
 [0 0 1]]

For each column, I need to find the largest row index at which the value is 1. If a column contains no 1 at all, I need to know that as well.

In the above example, through some array or encoding, I would need to know that:

 - Column 0 : no match
 - Column 1 : Row 2
 - Column 2 : Row 3

I have done:

result = np.where(arr == 1)

Where result is:

(array([0, 1, 2, 3], dtype=int64), array([2, 1, 1, 2], dtype=int64))

The first array indicates the row and the second array indicates the column.

So, this tells me that:

  • column 2 has a 1 at row 0 and row 3
  • column 1 has a 1 at row 1 and row 2

I can also infer that column 0 has no 1's so that is fine.

So, I need a way to find the largest value (row number) in the first array of result for each unique value (column number) in the second array.

So, an example of what I would need to return is:

 (array([2,3], dtype=int64), array([1,2], dtype=int64))

or something like that, where I can know that column 1 had a value of 1 occurring at a max row of 2. Column 2 had a value of 1 occurring at a max row of 3 etc. The first array in that example is the maximum row number corresponding (by index) to the second array which indicates the column.

CodePudding user response:

The columns that contain a 1 can be found with np.where and np.any(). Restricting the boolean mask to those columns and reversing it along the row axis lets np.argmax locate the last row that holds a 1 in each column:

cols = np.where((arr == 1).any(0))
mask = (arr == 1)[:, cols[0]]
rows = mask.shape[0] - np.argmax(mask[::-1, :], axis=0) - 1

# cols --> [1 2]
# rows --> [2 3]
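The question also asks how to tell when a column has no 1 at all. A minimal sketch of the same reversed-argmax idea, assuming -1 is an acceptable "no match" marker (the -1 sentinel and the names from_bottom and last_row are illustrative choices, not part of the answer above):

```python
import numpy as np

arr = np.array([[0, 0, 1], [0, 1, 0], [0, 1, 0], [0, 0, 1]])

mask = arr == 1
# Position of the first True when each column is scanned from the bottom up.
# argmax returns 0 for an all-False column, so that case is handled separately.
from_bottom = np.argmax(mask[::-1, :], axis=0)
# Convert back to a normal row index; use -1 where the column has no 1.
last_row = np.where(mask.any(axis=0), mask.shape[0] - 1 - from_bottom, -1)

print(last_row)  # values: -1, 2, 3 (one entry per column)
```

This keeps one entry per column, so the column number is simply the position in last_row.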

CodePudding user response:

This is what I came up with. It does the job, but not in a true NumPy way. Hopefully there is a more efficient approach that avoids the loop and relies on vectorized NumPy operations.

import numpy as np 
arr = np.array([[0,0,1],[0,1,0], [0,1,0], [0,0,1]])

result = np.where(arr == 1)
columns = result[1]
rows = result[0]
num_columns = arr.shape[1]  # number of columns in arr (3 in this example)


max_row_value = np.array([], dtype=np.ushort)
corresponding_column = np.array([], dtype=np.ushort)
for i in range(num_columns):
    indexes = np.where(columns == i)
    indexes = indexes[0]
    if len(indexes):
        max_value = np.max(rows[indexes])
        max_row_value = np.append(max_row_value, max_value)
        corresponding_column = np.append(corresponding_column,i)

print(max_row_value)  
print(corresponding_column)
   

Output:

[2 3]
[1 2]

The top array holds the maximum row number for each column that contains a 1. The bottom array holds the column to which each maximum row number corresponds.

It matches what I stated with:

 - Column 0 : no match
 - Column 1 : Row 2
 - Column 2 : Row 3
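One possible loop-free variant of the same idea, offered as a sketch rather than a definitive solution: np.where lists matches in row-major order, so the row indices come out ascending, and the last occurrence of each column number carries its largest row. Reversing both arrays and calling np.unique with return_index=True picks out exactly those entries. The name first_idx is illustrative.

```python
import numpy as np

arr = np.array([[0, 0, 1], [0, 1, 0], [0, 1, 0], [0, 0, 1]])
rows, columns = np.where(arr == 1)

# After reversing, the first occurrence of each column number is the one
# with the largest row. np.unique(..., return_index=True) returns the index
# of the first occurrence of every unique value.
corresponding_column, first_idx = np.unique(columns[::-1], return_index=True)
max_row_value = rows[::-1][first_idx]

print(max_row_value)         # values: 2, 3
print(corresponding_column)  # values: 1, 2
```

Columns with no 1 never appear in corresponding_column, which matches the "no match" case for column 0.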