I want to create a sparse matrix based on an n*n distance matrix, keeping the k smallest values in each row. I get the correct indices back from np.argpartition, but when I try to create a mask from them, all it does is select the diagonal as True and everything else as False.
nn_indices = np.argpartition(data, k - 1)[:, :k]
mask = np.isin(data, nn_indices)
Any idea how I can use the output from argpartition to create a boolean mask for those indices?
E.g., for n = 4, k = 2, with this input:
[[4, 6, 1, 3]
[1, 5, 6, 7]
[4, 7, 2, 3]
[7, 1, 8, 2]]
Argpartition output:
[[2, 3]
[0, 1]
[2, 3]
[1, 3]]
Desired output:
[[0, 0, 1, 3]
[1, 5, 0, 0]
[0, 0, 2, 3]
[0, 1, 0, 2]]
I have had a look at scipy.sparse.csr_matrix but can't get my head around how to order the column and row data.
Any help would be appreciated!
Answer:
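Allocate an all-False boolean array, then set the entries at the argpartition indices to True. Indexing with a column of row numbers lets it broadcast against nn_indices: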
>>> mask = np.zeros(data.shape, bool)
>>> mask[np.arange(len(data))[:, None], nn_indices] = True
>>> mask
array([[False, False,  True,  True],
       [ True,  True, False, False],
       [False, False,  True,  True],
       [False,  True, False,  True]])
>>> np.where(mask, data, 0)
array([[0, 0, 1, 3],
       [1, 5, 0, 0],
       [0, 0, 2, 3],
       [0, 1, 0, 2]])
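If you want an actual sparse matrix rather than a dense array with zeros, one possible approach is to skip the mask and pass the row indices, column indices, and values straight to scipy.sparse.csr_matrix in its (data, (row, col)) form. The sketch below reuses the example from the question; the names rows, cols, vals, and sparse are only illustrative, and it assumes each row contributes exactly k entries.

```python
import numpy as np
from scipy.sparse import csr_matrix

# Example distance matrix from the question (n = 4, k = 2).
data = np.array([[4, 6, 1, 3],
                 [1, 5, 6, 7],
                 [4, 7, 2, 3],
                 [7, 1, 8, 2]])
k = 2
n = data.shape[0]

# Column indices of the k smallest values in each row (order within a row is arbitrary).
nn_indices = np.argpartition(data, k - 1, axis=1)[:, :k]

# Row index repeated k times so it lines up with the flattened column indices.
rows = np.repeat(np.arange(n), k)
cols = nn_indices.ravel()
vals = data[rows, cols]

# Build the sparse matrix from (value, (row, col)) triplets.
sparse = csr_matrix((vals, (rows, cols)), shape=data.shape)
print(sparse.toarray())
```

The order of the triplets does not matter here, since csr_matrix sorts them into rows itself. One thing to keep in mind: in a real distance matrix the diagonal is usually zero, so each point will likely be picked as its own nearest neighbour unless you exclude it first.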