Vector to an array of vectors of neighbours-CodePudding

I'd like to take a vector and get an array of vectors in which the i-th element of each vector are the k neighbors of the i-th element of the original vector. Also, I'm looking for the fastest way to do so.

I've already done that in MATLAB:

a=zeros(k, length(v));   
I=cell(1,k);

a(1,:) = v;

for j=2:k
    a(k,:)=[a(k-1,2:end),a(k-1,1)];
end

aux1=[a(:,(end-r 1):end),a(:,1:(end-r))];

for j=1:k
    I{k}=aux1(k,:);
end

For example, v = [1, 2, 3, 4, 5] and k = 1; and I want to get:

M = [[5, 1, 2, 3, 4], [1, 2, 3, 4, 5], [2, 3, 4, 5, 1]]

so that, for the 1st element of each vector, I get [5; 1; 2], which are the element 1 and its neighbors.

Hope it makes sense. Thanks for reading :)

CodePudding user response：

You could use the numpy roll function:

import numpy as np

def get_neighbors(v, k):
    N = len(v)
    M = np.zeros((k*2 1, N), dtype=int)
    for i in range(-k, k 1):
        M[i k, :] = np.roll(v, -i)
    return M

v = np.array([1, 2, 3, 4, 5])
k = 1
M = get_neighbors(v, k)
print(M)

Output:

[[5 1 2 3 4]
 [1 2 3 4 5]
 [2 3 4 5 1]]

CodePudding user response：

Using sliding_window_view on a repetition of your array can do it "vectorized" way

# Example array
a = np.arange(1,16)
k = 2 # Window of neighbors

# My solution
np.lib.stride_tricks.sliding_window_view(np.hstack([a,a,a]), (len(a),))[len(a)-k:len(a) k 1]

Returns

array([[14, 15,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13],
       [15,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14],
       [ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15],
       [ 2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15,  1],
       [ 3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15,  1,  2]])

Note that sliding_window_view creates just a view. It doesn't create new data. Hence the reason why I do not hesitate creating (in this example) 31 lines (3*15-15 1), and then subset only 5 of them: I do not really create them. So only real cost of that solution is in hstack, both cpu-wise and memory-wise.

That subset, btw, was done to abide strictly by what you asked. But, depending on what you intend to do, you may drop the subset. Important point is that if

T=np.lib.stride_tricks.sliding_window_view(np.hstack([a,a,a]), (len(a),))

Then T[len(a) k] is a row made of the kth neighbor, whether k is positive, negative or 0 (the original row)

See timings, since it matters for you

sizes	This method	Roll method
len=15/k=2	51 μs	132 μs
len=15/k=7	51 μs	383 μs
len=1000/k=7	52 μs	422 μs
len=1M/k=7	6 ms	160 ms
len=1M/k=100	6 ms	2.2 s

Roll method is obviously proportional to the size of the window (O(k) — it has one roll to perform per row of output), when sliding_window_view is just a view, and does not really create rows, so is O(1) as far as k is concerned. Both method are equally impacted by len of data (O(n) really, but it shows only for n big enough).

So, all together, this method is O(n) while roll method is O(kn)