How to convert this nested loop to numpy broadcast?-CodePudding

I want to rearrange my data (two even-length 1d arrays):

cs = [w x y z]
rs = [a b c d e f]

to make a result like this:

[[a b w x]
 [c d w x]
 [e f w x]
 [a b y z]
 [c d y z]
 [e f y z]]

This is what I have tried (it works):

ls = []
for c in range(0,len(cs),2):
  for r in range(0,len(rs),2):
    item = [rs[r], rs[r 1], cs[c], cs[c 1]]
    ls.append(item)

But I want to get the same result using reshaping/broadcasting or other numpy functions. What is the idiomatic way to do this task in numpy?

CodePudding user response：

You could tile the elements of rs, repeat the elements of cs and then arrange those as columns for a 2D array:

import numpy as np

cs = np.array(['w', 'x', 'y', 'z'])
rs = np.array(['a', 'b', 'c', 'd', 'e', 'f'])
res = np.c_[np.tile(rs[::2], len(cs) // 2), np.tile(rs[1::2], len(cs) // 2),
            np.repeat(cs[::2], len(rs) // 2), np.repeat(cs[1::2], len(rs) // 2)]

Result:

array([['a', 'b', 'w', 'x'],
       ['c', 'd', 'w', 'x'],
       ['e', 'f', 'w', 'x'],
       ['a', 'b', 'y', 'z'],
       ['c', 'd', 'y', 'z'],
       ['e', 'f', 'y', 'z']], dtype='<U1')

An alternative:

np.c_[np.tile(rs.reshape(-1, 2), (len(cs) // 2, 1)),
      np.repeat(cs.reshape(-1, 2), len(rs) // 2, axis=0)]

CodePudding user response：

An alternative to using tile/repeat, is to generate repeated row indices.

Make the two arrays - reshaped as they will be combined:

In [106]: rs=np.reshape(list('abcdef'),(3,2))
In [107]: cs=np.reshape(list('wxyz'),(2,2))
In [108]: rs
Out[108]: 
array([['a', 'b'],
       ['c', 'd'],
       ['e', 'f']], dtype='<U1')
In [109]: cs
Out[109]: 
array([['w', 'x'],
       ['y', 'z']], dtype='<U1')

Make 'meshgrid' like indices (itertools.product could also be used)

In [110]: IJ = np.indices((3,2))
In [111]: IJ
Out[111]: 
array([[[0, 0],
        [1, 1],
        [2, 2]],

       [[0, 1],
        [0, 1],
        [0, 1]]])

reshape with order gives two 1d arrays:

In [112]: I,J=IJ.reshape(2,6,order='F')
In [113]: I,J
Out[113]: (array([0, 1, 2, 0, 1, 2]), array([0, 0, 0, 1, 1, 1]))

Then just index the rs and cs and combine them with hstack:

In [114]: np.hstack((rs[I],cs[J]))
Out[114]: 
array([['a', 'b', 'w', 'x'],
       ['c', 'd', 'w', 'x'],
       ['e', 'f', 'w', 'x'],
       ['a', 'b', 'y', 'z'],
       ['c', 'd', 'y', 'z'],
       ['e', 'f', 'y', 'z']], dtype='<U1')

edit

Here's another way of looking this - a bit more advanced. With sliding_window_view we can get a "block" view of that Out[114] result:

In [130]: np.lib.stride_tricks.sliding_window_view(_114,(3,2))[::3,::2,:,:]
Out[130]: 
array([[[['a', 'b'],
         ['c', 'd'],
         ['e', 'f']],

        [['w', 'x'],
         ['w', 'x'],
         ['w', 'x']]],


       [[['a', 'b'],
         ['c', 'd'],
         ['e', 'f']],

        [['y', 'z'],
         ['y', 'z'],
         ['y', 'z']]]], dtype='<U1')

With a bit more reverse engineering, I find I can create Out[114] with:

In [147]: res = np.zeros((6,4),'U1')
In [148]: res1 = np.lib.stride_tricks.sliding_window_view(res,(3,2),writeable=True)[::3,::2,:,:]
In [149]: res1[:,0,:,:] = rs
In [150]: res1[:,1,:,:] = cs[:,None,:]
In [151]: res
Out[151]: 
array([['a', 'b', 'w', 'x'],
       ['c', 'd', 'w', 'x'],
       ['e', 'f', 'w', 'x'],
       ['a', 'b', 'y', 'z'],
       ['c', 'd', 'y', 'z'],
       ['e', 'f', 'y', 'z']], dtype='<U1')

I can't say that either of these is superior, but they show there are various ways of "vectorizing" this kind of array layout.