I want to rearrange my data (two even-length 1d arrays):
cs = [w x y z]
rs = [a b c d e f]
to make a result like this:
[[a b w x]
[c d w x]
[e f w x]
[a b y z]
[c d y z]
[e f y z]]
This is what I have tried (it works):
ls = []
for c in range(0,len(cs),2):
for r in range(0,len(rs),2):
item = [rs[r], rs[r 1], cs[c], cs[c 1]]
ls.append(item)
But I want to get the same result using reshaping/broadcasting or other numpy functions. What is the idiomatic way to do this task in numpy?
CodePudding user response:
You could tile the elements of rs
, repeat the elements of cs
and then arrange those as columns for a 2D array:
import numpy as np
cs = np.array(['w', 'x', 'y', 'z'])
rs = np.array(['a', 'b', 'c', 'd', 'e', 'f'])
res = np.c_[np.tile(rs[::2], len(cs) // 2), np.tile(rs[1::2], len(cs) // 2),
np.repeat(cs[::2], len(rs) // 2), np.repeat(cs[1::2], len(rs) // 2)]
Result:
array([['a', 'b', 'w', 'x'],
['c', 'd', 'w', 'x'],
['e', 'f', 'w', 'x'],
['a', 'b', 'y', 'z'],
['c', 'd', 'y', 'z'],
['e', 'f', 'y', 'z']], dtype='<U1')
An alternative:
np.c_[np.tile(rs.reshape(-1, 2), (len(cs) // 2, 1)),
np.repeat(cs.reshape(-1, 2), len(rs) // 2, axis=0)]
CodePudding user response:
An alternative to using tile/repeat
, is to generate repeated row indices.
Make the two arrays - reshaped as they will be combined:
In [106]: rs=np.reshape(list('abcdef'),(3,2))
In [107]: cs=np.reshape(list('wxyz'),(2,2))
In [108]: rs
Out[108]:
array([['a', 'b'],
['c', 'd'],
['e', 'f']], dtype='<U1')
In [109]: cs
Out[109]:
array([['w', 'x'],
['y', 'z']], dtype='<U1')
Make 'meshgrid' like indices (itertools.product
could also be used)
In [110]: IJ = np.indices((3,2))
In [111]: IJ
Out[111]:
array([[[0, 0],
[1, 1],
[2, 2]],
[[0, 1],
[0, 1],
[0, 1]]])
reshape
with order
gives two 1d arrays:
In [112]: I,J=IJ.reshape(2,6,order='F')
In [113]: I,J
Out[113]: (array([0, 1, 2, 0, 1, 2]), array([0, 0, 0, 1, 1, 1]))
Then just index the rs
and cs
and combine them with hstack
:
In [114]: np.hstack((rs[I],cs[J]))
Out[114]:
array([['a', 'b', 'w', 'x'],
['c', 'd', 'w', 'x'],
['e', 'f', 'w', 'x'],
['a', 'b', 'y', 'z'],
['c', 'd', 'y', 'z'],
['e', 'f', 'y', 'z']], dtype='<U1')
edit
Here's another way of looking this - a bit more advanced. With sliding_window_view we can get a "block" view of that Out[114]
result:
In [130]: np.lib.stride_tricks.sliding_window_view(_114,(3,2))[::3,::2,:,:]
Out[130]:
array([[[['a', 'b'],
['c', 'd'],
['e', 'f']],
[['w', 'x'],
['w', 'x'],
['w', 'x']]],
[[['a', 'b'],
['c', 'd'],
['e', 'f']],
[['y', 'z'],
['y', 'z'],
['y', 'z']]]], dtype='<U1')
With a bit more reverse engineering, I find I can create Out[114] with:
In [147]: res = np.zeros((6,4),'U1')
In [148]: res1 = np.lib.stride_tricks.sliding_window_view(res,(3,2),writeable=True)[::3,::2,:,:]
In [149]: res1[:,0,:,:] = rs
In [150]: res1[:,1,:,:] = cs[:,None,:]
In [151]: res
Out[151]:
array([['a', 'b', 'w', 'x'],
['c', 'd', 'w', 'x'],
['e', 'f', 'w', 'x'],
['a', 'b', 'y', 'z'],
['c', 'd', 'y', 'z'],
['e', 'f', 'y', 'z']], dtype='<U1')
I can't say that either of these is superior, but they show there are various ways of "vectorizing" this kind of array layout.