Let's say I have 2 arrays:
a = np.array([[[1],[1]],[[2],[2]],[[3],[3]],[[4],[4]]])
b = np.array(["a", "b", "c", "d"]).reshape(4,1,1)
Then a.shape = (4,2,1) and b.shape = (4,1,1). My desired output would look like this:
c = np.array([
[
np.array([[1],[1]]), "a"
],
[
np.array([[2],[2]]), "b"
],
[
np.array([[3],[3]]), "c"
],
[
np.array([[4],[4]]), "d"
],
])
I tried np.hstack and np.concatenate, but neither quite does what I want. I realize I can simply loop through a and b and create the array; I am just wondering whether there is a specific function that would return array c, or whether a loop is my best bet here.
CodePudding user response:
You can iterate over both arrays together with zip in a one-liner:
a = np.array([[[1],[1]],[[2],[2]],[[3],[3]],[[4],[4]]])
b = np.array(["a", "b", "c", "d"])
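# dtype=object is needed on NumPy 1.24+, where ragged input otherwise raises ValueError
c = np.array([[aa, bb] for aa, bb in zip(a, b)], dtype=object)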
CodePudding user response:
The closest I can get is using np.column_stack([a, b]). However, this converts the integers to strings and puts the string characters inside the NumPy array, instead of outside as you have.
>>> np.column_stack([a,b])
array([[['1'],
['1'],
['a']],
[['2'],
['2'],
['b']],
[['3'],
['3'],
['c']],
[['4'],
['4'],
['d']]], dtype='<U11')
CodePudding user response:
First, remove the unnecessary dimensions from b:
In [203]: b1 = b.ravel()
Second, make a 4-element object-dtype array from a:
In [204]: a1 = np.empty(4, object)
In [205]: a1[:] = list(a)
In [206]: a1
Out[206]:
array([array([[1],
[1]]), array([[2],
[2]]), array([[3],
[3]]), array([[4],
[4]])],
dtype=object)
Now we can stack them on a new 2nd axis to make a (4,2) array:
In [207]: np.stack((a1,b1), axis=1)
Out[207]:
array([[array([[1],
[1]]), 'a'],
[array([[2],
[2]]), 'b'],
[array([[3],
[3]]), 'c'],
[array([[4],
[4]]), 'd']], dtype=object)
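The steps above can be wrapped in a small helper. This is only a sketch: the name pair_with_labels is made up for illustration (it is not a NumPy function), and it fills the object array with an explicit loop instead of the slice assignment, which should be equivalent but leaves no doubt about what goes into each slot.

```python
import numpy as np


def pair_with_labels(a, b):
    """Return an (n, 2) object array whose rows are [a[i], label_i].

    Hypothetical helper for illustration, not part of NumPy.
    """
    labels = b.ravel()                      # drop the singleton dimensions of b
    boxed = np.empty(len(a), dtype=object)  # one object slot per sub-array
    for i, sub in enumerate(a):
        boxed[i] = sub                      # store the (2, 1) array as a single object
    # Both inputs are now 1-d with length n, so stacking on axis 1 gives (n, 2).
    return np.stack((boxed, labels), axis=1)


a = np.array([[[1], [1]], [[2], [2]], [[3], [3]], [[4], [4]]])
b = np.array(["a", "b", "c", "d"]).reshape(4, 1, 1)
c = pair_with_labels(a, b)
```

Each c[i, 0] is then a (2, 1) integer array and each c[i, 1] is the matching label.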
The list comprehension is simpler, and probably faster:
In [209]: [[i,j.item()] for i,j in zip(a,b)]
Out[209]:
[[array([[1],
[1]]),
'a'],
[array([[2],
[2]]),
'b'],
[array([[3],
[3]]),
'c'],
[array([[4],
[4]]),
'd']]
Or make a structured array with 2 fields:
In [215]: c = np.zeros(4, dtype='O,U1')
In [216]: c
Out[216]:
array([(0, ''), (0, ''), (0, ''), (0, '')],
dtype=[('f0', 'O'), ('f1', '<U1')])
In [218]: c['f0'] = list(a)
In [219]: c['f1'] = b.ravel()
In [220]: c
Out[220]:
array([(array([[1],
[1]]), 'a'), (array([[2],
[2]]), 'b'), (array([[3],
[3]]), 'c'),
(array([[4],
[4]]), 'd')], dtype=[('f0', 'O'), ('f1', '<U1')])
Or give the first field a fixed (2,1) integer shape instead of object dtype:
In [221]: c = np.zeros(4, dtype=[('a',int,(2,1)),('b','U1')])
In [222]: c['a']=a
In [224]: c['b']=b.ravel()
In [225]: c
Out[225]:
array([([[1], [1]], 'a'), ([[2], [2]], 'b'), ([[3], [3]], 'c'),
([[4], [4]], 'd')], dtype=[('a', '<i8', (2, 1)), ('b', '<U1')])
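As a rough illustration of why the second structured layout can be convenient, the fields can be pulled back out as ordinary arrays. The variable names below (rec, numbers, labels, third) are chosen here for the example only.

```python
import numpy as np

a = np.array([[[1], [1]], [[2], [2]], [[3], [3]], [[4], [4]]])
b = np.array(["a", "b", "c", "d"]).reshape(4, 1, 1)

# One record per row: a (2, 1) block of ints plus a 1-character string.
rec = np.zeros(4, dtype=[("a", int, (2, 1)), ("b", "U1")])
rec["a"] = a
rec["b"] = b.ravel()

numbers = rec["a"]   # ordinary integer array with the original (4, 2, 1) shape
labels = rec["b"]    # 1-d array of the four labels
third = rec[2]       # a single record holding both fields for row 2
```

Because the 'a' field is numeric rather than object dtype, numbers still supports normal vectorized arithmetic.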
The key to all these methods is understanding what you are trying to create. You aren't making a "normal" multidimensional array. You are trying to mix arrays and strings.
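To make that point concrete, here is a small sketch (variable names are illustrative) that builds the (4,2) object array slot by slot and then shows that the first column is just a collection of separate arrays, which has to be re-stacked before it behaves like a numeric array again.

```python
import numpy as np

a = np.array([[[1], [1]], [[2], [2]], [[3], [3]], [[4], [4]]])
labels = ["a", "b", "c", "d"]

# A (4, 2) object array: each slot holds a reference to an arbitrary Python object.
c = np.empty((4, 2), dtype=object)
for i in range(4):
    c[i, 0] = a[i]        # a (2, 1) integer array
    c[i, 1] = labels[i]   # a plain Python string

first_column = c[:, 0]                    # still object dtype, shape (4,)
restored = np.stack(list(first_column))   # back to a numeric (4, 2, 1) array
```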
You can concatenate to make a (4,3,1) array, and preserve the ints by specifying the dtype. This joins a (4,2,1) with a (4,1,1) on the middle dimension. That's what concatenate (and the stack derivatives) does best.
In [263]: np.concatenate((a,b),axis=1, dtype=object)
Out[263]:
array([[[1],
[1],
['a']],
[[2],
[2],
['b']],
[[3],
[3],
['c']],
[[4],
[4],
['d']]], dtype=object)
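A quick way to check that the integers really survive is to look at individual elements. This is a sketch with made-up variable names; note that the dtype argument of np.concatenate needs NumPy 1.20 or later.

```python
import numpy as np

a = np.array([[[1], [1]], [[2], [2]], [[3], [3]], [[4], [4]]])
b = np.array(["a", "b", "c", "d"]).reshape(4, 1, 1)

# Join on the middle axis; dtype=object keeps ints as ints and strings as strings.
joined = np.concatenate((a, b), axis=1, dtype=object)

first_int = joined[0, 0, 0]     # element that came from a
first_label = joined[0, 2, 0]   # element that came from b
```

Without dtype=object the integers would be converted to strings, as in the column_stack output shown in the other answer.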