I have two NumPy arrays.
Each row of the first array has three elements: class name, ID, and coordinates:
['Mobile Phone', '000ad20b5e452b24','0.0196.800512617.6939.200512']
Each row of the second array also has three elements: class name, ID, and image array:
['Mobile Phone', '000ad20b5e452b24',
 array([[[183, 205, 210],
         [181, 203, 208],
         [186, 206, 211],
         ...,
         [202, 216, 222],
         [201, 213, 219],
         [202, 214, 220]],
        [[178, 200, 205],
         [177, 199, 204],
         [179, 199, 204],
         ...,
         [186, 200, 206],
         [189, 201, 207],
         [194, 206, 212]],
        [[174, 196, 201],
         [173, 195, 200],
         [174, 193, 200],
         ...,
         [170, 184, 190],
         [172, 184, 190],
         [177, 189, 195]],
        ...,
        [[217, 226, 235],
         [216, 225, 234],
         [213, 222, 231],
         ...,
         [ 88,  97, 110],
         [ 96, 105, 118],
         [100, 109, 122]],
        [[202, 209, 218],
         [193, 200, 209],
         [181, 190, 199],
         ...,
         [124, 128, 139],
         [134, 138, 149],
         [139, 143, 154]],
        [[183, 190, 199],
         [168, 175, 184],
         [152, 161, 170],
         ...,
         [147, 149, 159],
         [160, 162, 173],
         [167, 169, 180]]]
The first array can contain duplicate IDs, but the second one doesn't.
For each row in the first array, I want to check whether the second array has a row with the same ID and class name and, if so, get (or append) its image array.
CodePudding user response:
Here is a simplified example that uses a 1-D array in place of each image. Only one of the two rows in classes has a match in image_data; the ID of the other row is not found there. Converting both arrays to pandas DataFrames and doing an inner join on class and id keeps only the matching row, and inspecting the type of the image value shows that it is still an ndarray.
import numpy as np
import pandas as pd
classes = np.array([['Mobile Phone', '000ad20b5e452b24', '0.0196.800512617.6939.200512'],
                    ['Mobile Phone', '000ad20b5e99999', '0.0196.800512617.6939.200512']])
# dtype=object is needed because each row mixes strings with an ndarray;
# NumPy 1.24 and later raise a ValueError for such ragged input without it.
image_data = np.array([['Mobile Phone', '000ad20b5e452b24', np.array([183, 205, 210])],
                       ['Mobile Phone', '000ad20b5e444444', np.array([183, 205, 210])]],
                      dtype=object)
c = pd.DataFrame(classes, columns=['class', 'id', 'coordinates'])
i = pd.DataFrame(image_data, columns=['class', 'id', 'image'])
# The inner join keeps only rows whose class and id appear in both frames.
output = c.merge(i, on=['class', 'id'], how='inner')
print(output)
print(type(output['image'].iloc[0]))
Output
          class                id                   coordinates            image
0  Mobile Phone  000ad20b5e452b24  0.0196.800512617.6939.200512  [183, 205, 210]
<class 'numpy.ndarray'>
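If you would rather keep every row of the first array, including duplicate IDs and rows with no image, a left join is one possible variation on the inner join above. The following is only a sketch under assumptions: the sample rows and the has_image column name are made up for illustration, and validate='m:1' is used on the assumption that the second array really has no duplicate class/ID pairs.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: the first ID appears twice, the last ID has no image.
classes = np.array([
    ['Mobile Phone', '000ad20b5e452b24', '0.0196.800512617.6939.200512'],
    ['Mobile Phone', '000ad20b5e452b24', '0.0196.800512617.6939.200512'],
    ['Mobile Phone', '000ad20b5e99999', '0.0196.800512617.6939.200512'],
])
image_data = np.array([
    ['Mobile Phone', '000ad20b5e452b24', np.array([183, 205, 210])],
    ['Mobile Phone', '000ad20b5e444444', np.array([183, 205, 210])],
], dtype=object)

c = pd.DataFrame(classes, columns=['class', 'id', 'coordinates'])
i = pd.DataFrame(image_data, columns=['class', 'id', 'image'])

# how='left' keeps every row of c, in its original order.
# indicator=True adds a '_merge' column: 'both' for matched rows,
# 'left_only' for rows with no counterpart in i.
# validate='m:1' raises an error if i contains duplicate (class, id) pairs.
output = c.merge(i, on=['class', 'id'], how='left',
                 indicator=True, validate='m:1')

# Illustrative helper column (name is an assumption, not part of pandas).
output['has_image'] = output['_merge'] == 'both'

print(output[['id', 'has_image']])
```

Rows without a match get NaN in the image column, so has_image (or the _merge column itself) can be used to filter them out before working with the image arrays.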