I have a numpy structured array:
import numpy as np
arr1 = (np.array([32, 32, 32, 32, 32, 39, 21], dtype=np.int64),np.array([449, 451, 452, 453, 454, 463, 340], dtype=np.int64))
arr1_x = arr1[0]
arr1_y = arr1[1]
arr1_struct = np.empty(arr1_x.shape[0], dtype=[('x', int), ('y', int)])
arr1_struct["x"] = arr1_x
arr1_struct["y"] = arr1_y
I need to determine the group of y values that corresponds to each unique x value.
Using the above example, I need to know that:
- Where x=32, y=449, 451, 452, 453, 454
- Where x=39, y=463
- Where x=21, y=340
I looked into np.unique, but it didn't give exactly what I'm after.
Any output format that provides this information would be fine. For example:
x_unique = [32,39,21]
y_unique = [ [449,451,452,453,454] , [463] , [340] ]
Here the x value at index 0 in the x_unique array/list has the y values at index 0 in the y_unique array/list, and so on.
Is there an easy and efficient way to do this in numpy?
CodePudding user response:
It sounds like pandas would be a good fit here, since it lets you work with a DataFrame-like structure.
import pandas as pd
df = pd.DataFrame([*arr1], index=['x', 'y']).T
which gives
    x    y
0  32  449
1  32  451
2  32  452
3  32  453
4  32  454
5  39  463
6  21  340
Then,
>>> df.groupby('x')['y'].unique().to_dict()
{21: array([340]), 32: array([449, 451, 452, 453, 454]), 39: array([463])}
Without pandas, you can refer to this thread for analogous ways to perform a groupby with numpy or pure Python.
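As a minimal numpy-only sketch of one such approach (not necessarily what the linked thread suggests): sort by x, use np.unique with return_index=True to find where each group starts, and split the sorted y values at those positions. The names order, first_idx and y_groups are chosen here for illustration. Note that the groups come out sorted by x, like the groupby result above, not in order of first appearance.

```python
import numpy as np

x = np.array([32, 32, 32, 32, 32, 39, 21], dtype=np.int64)
y = np.array([449, 451, 452, 453, 454, 463, 340], dtype=np.int64)

# A stable sort keeps the original relative order of y within each x group.
order = np.argsort(x, kind="stable")
x_sorted = x[order]
y_sorted = y[order]

# first_idx[i] is the position in x_sorted where the i-th unique value starts.
x_unique, first_idx = np.unique(x_sorted, return_index=True)

# Split y at every group boundary (skip index 0, which starts the first group).
y_groups = np.split(y_sorted, first_idx[1:])

for xv, ys in zip(x_unique, y_groups):
    print(xv, ys.tolist())
```

Here x_unique[i] pairs with y_groups[i], which matches the x_unique / y_unique layout asked for in the question. If duplicates within a group should be dropped as well, np.unique can be applied to each entry of y_groups.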