Suppose I have the following NumPy array:
import numpy as np
arr = np.array([['a', -1, -1, -1],
                [-1, 'b', 'c', -1],
                ['e', -1, 'd', 'f']], dtype=object)
Now, for each row that contains more than one element other than -1, I would like to find the column-index distance between neighboring such elements.
For example, for the pair ('b', 'c'), 'c' is in the third column (index 2) and 'b' is in the second column (index 1), so the distance is 2 - 1 = 1.
For ('e', 'd'), the distance would be 2 - 0 = 2. For ('d', 'f'), it would be 3 - 2 = 1. The pair ('e', 'f') is not considered, because 'd' sits between them.
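To pin down the expected output, here is a naive pure-Python sketch of the definition above. The helper name neighbor_distances and its missing parameter are made up for illustration; this is not an existing NumPy function.

```python
import numpy as np

arr = np.array([['a', -1, -1, -1],
                [-1, 'b', 'c', -1],
                ['e', -1, 'd', 'f']], dtype=object)

def neighbor_distances(row, missing=-1):
    """Return ((left, right), distance) for consecutive non-missing entries.

    Hypothetical helper, written only to illustrate the question.
    """
    # Keep (column index, value) for every entry that is not the placeholder.
    hits = [(i, v) for i, v in enumerate(row) if v != missing]
    # Pair each hit with the next one and subtract the column indices.
    return [((a, b), j - i) for (i, a), (j, b) in zip(hits, hits[1:])]

for row in arr:
    print(neighbor_distances(row))
```

A row with fewer than two real elements gives an empty list, and ('e', 'f') never shows up because only consecutive hits are paired.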
Answer:
I think this does what you want:
[np.diff(np.where(row!=-1)).flatten() for row in arr]
The result:
[array([], dtype=int64), array([1]), array([2, 1])]
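To see what the one-liner is doing, here is the same computation unpacked for the last row. The intermediate names (mask, cols, gaps) are just for illustration, and np.flatnonzero is used in place of np.where(...) plus flatten(), which should be equivalent for a 1-D row.

```python
import numpy as np

arr = np.array([['a', -1, -1, -1],
                [-1, 'b', 'c', -1],
                ['e', -1, 'd', 'f']], dtype=object)

row = arr[2]                  # ['e', -1, 'd', 'f']
mask = row != -1              # elementwise comparison on the object array
cols = np.flatnonzero(mask)   # column indices of the entries that are not -1
gaps = np.diff(cols)          # differences between consecutive column indices

print(mask)
print(cols)
print(gaps)
```

Because np.diff only subtracts adjacent indices, non-neighboring pairs such as ('e', 'f') are skipped automatically.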
I can't think of a way to vectorize it (i.e. to avoid the loop). It's also kind of a weird data structure: NumPy arrays are meant to contain elements of a single type, so depending on what you want to do, you might find a plain object or Unicode array more amenable.
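That said, one possible way to avoid the Python-level loop is to take the differences over all non -1 positions at once and then drop the ones that cross a row boundary. This is only a sketch: it relies on np.nonzero returning indices in row-major order, and the variable names are mine.

```python
import numpy as np

arr = np.array([['a', -1, -1, -1],
                [-1, 'b', 'c', -1],
                ['e', -1, 'd', 'f']], dtype=object)

# Row and column indices of every entry that is not -1, in row-major order,
# so the column indices are already sorted within each row.
rows, cols = np.nonzero(arr != -1)

# A difference is only meaningful if both positions lie in the same row.
same_row = rows[1:] == rows[:-1]
dists = np.diff(cols)[same_row]    # flat array of all distances
dist_rows = rows[1:][same_row]     # row that each distance belongs to

# Optional: split the flat result back into one array per row.
counts = np.bincount(dist_rows, minlength=arr.shape[0])
per_row = np.split(dists, np.cumsum(counts)[:-1])

print(dists)
print(per_row)
```

The flat dists / dist_rows pair is usually easier to work with than a ragged list; the np.split step is only there to reproduce the per-row shape of the list comprehension.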