Find groups of Y values for unique X values, numpy structured array?

Time:06-10

I have a numpy structured array:

import numpy as np
arr1 = (np.array([32, 32, 32, 32, 32, 39, 21], dtype=np.int64),
        np.array([449, 451, 452, 453, 454, 463, 340], dtype=np.int64))

arr1_x = arr1[0]
arr1_y = arr1[1]

arr1_struct = np.empty(arr1_x.shape[0], dtype=[('x', int), ('y', int)])

arr1_struct["x"] = arr1_x
arr1_struct["y"] = arr1_y

I need to find the group of y values that corresponds to each unique x value.

Using the above example, I would need to know that:

  • Where x=32, y=449, 451, 452, 453, 454
  • Where x=39, y=463
  • Where x=21, y=340

I was looking into np.unique but that didn't give exactly what I'm after.

Any output format that provides this information would be fine. For example:

  • x_unique = [32,39,21]
  • y_unique = [ [449,451,452,453,454] , [463] , [340] ]

Here the X value at index 0 of x_unique has the Y values at index 0 of y_unique, and so on.

Is there an easy and efficient way to do this in numpy?

CodePudding user response:

It sounds like you should use pandas and work with a DataFrame:

import pandas as pd

df = pd.DataFrame([*arr1], index=['x', 'y']).T

which gives

    x    y
0  32  449
1  32  451
2  32  452
3  32  453
4  32  454
5  39  463
6  21  340

Then,

>>> df.groupby('x')['y'].unique().to_dict()
{21: array([340]), 32: array([449, 451, 452, 453, 454]), 39: array([463])}
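
Note that groupby sorts the keys by default, and unique() drops repeated y values within a group. If the order from the question (32, 39, 21) matters, or duplicates should be kept, a variant along these lines may be closer to the requested x_unique / y_unique output. This is only a sketch: the names x_unique and y_groups are illustrative, and the DataFrame is built from a dict here rather than with the transpose above.

```python
import pandas as pd

# Same data as in the question, built directly as two columns.
df = pd.DataFrame({
    "x": [32, 32, 32, 32, 32, 39, 21],
    "y": [449, 451, 452, 453, 454, 463, 340],
})

# sort=False keeps groups in order of first appearance of each x;
# agg(list) collects every y in the group, duplicates included.
grouped = df.groupby("x", sort=False)["y"].agg(list)

x_unique = grouped.index.tolist()
y_groups = grouped.tolist()

print(x_unique)
print(y_groups)
```

The two lists are aligned by position, so y_groups[i] holds the y values for x_unique[i].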

Without pandas, you can refer to this thread for analogous ways to perform a groupby with numpy or pure python.
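
As one possible numpy-only approach (a sketch, not necessarily what the linked thread recommends), you can sort by x, locate where each unique x starts with np.unique(..., return_index=True), and cut the sorted y values at those positions with np.split. The helper name group_y_by_x is made up for this example, and the result comes back ordered by x rather than by first appearance.

```python
import numpy as np

# Rebuild the structured array from the question.
arr1_struct = np.empty(7, dtype=[("x", int), ("y", int)])
arr1_struct["x"] = [32, 32, 32, 32, 32, 39, 21]
arr1_struct["y"] = [449, 451, 452, 453, 454, 463, 340]


def group_y_by_x(x, y):
    """Return (unique x values, list of y arrays), sorted by x."""
    # A stable sort keeps the original order of y within each x group.
    order = np.argsort(x, kind="stable")
    x_sorted = x[order]
    y_sorted = y[order]
    # starts[i] is the index in x_sorted where the i-th unique x first occurs.
    x_unique, starts = np.unique(x_sorted, return_index=True)
    # Cut y_sorted at every group start except the first one (index 0).
    y_groups = np.split(y_sorted, starts[1:])
    return x_unique, y_groups


x_unique, y_groups = group_y_by_x(arr1_struct["x"], arr1_struct["y"])

for xv, ys in zip(x_unique, y_groups):
    print(xv, ys)
```

This avoids a Python-level loop over the rows, so it should scale reasonably to large arrays; the groups themselves are views into the sorted y array rather than copies.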
