Concatenate x, y arrays keeping row index

Time:01-27

I have one multiindex dataframe which contains the x and y coordinates of different body segments across time. It looks like this:

segment         0                         1  ...      98        99                
coords          k       x       y         k  ...       y         k       x       y
0        0.008525  312.05  361.65  0.011500  ...  329.97  0.012414  621.83  327.77
1        0.004090  312.32  359.98  0.007290  ...  329.00  0.034572  623.31  327.13
2        0.006645  313.42  359.11  0.011194  ...  330.53  0.003275  621.18  327.55
3        0.008367  314.79  361.47  0.013591  ...  329.58  0.026624  624.32  327.76
4        0.005160  315.91  364.54  0.009056  ...  329.97  0.026840  624.54  327.97
...           ...     ...     ...       ...  ...     ...       ...     ...     ...
40006   -0.081192  323.60  354.73 -0.070411  ...  431.78  0.088513  432.43  433.49
40007   -0.050125  319.29  357.99 -0.074568  ...  431.00  0.470994  436.47  432.65

The shape is 40008 rows and 300 columns. The k value I do not need.

For some plotting, however, I need my data to look like this:

[[index0, x_i0_s0, y_i0_s0],
[index0, x_i0_s1,y_i0_s1],
[index0, x_i0_s2,y_i0_s2],
...
[index40007, x_i40007_s97, y_i40007_s97],
[index40007, x_i40007_s98, y_i40007_s98],
[index40007, x_i40007_s99, y_i40007_s99]]

Or with real data:

[[0, 312.05, 361.65],
...
[40007, 436.47, 432.65]]

So basically I can get rid of the segment ID, but keep the index. The output array should have the following dimensions: (len(index) * segments, 3), which in this case is (4000800, 3).

Since I am not very good at manipulating multi-index dataframes I have tried to get the x and y coordinates separately by:

x = df.xs(('x',), level=('coords',), axis=1)
y = df.xs(('y',), level=('coords',), axis=1)

And after that I have tried different things like np.column_stack() and np.reshape() but without success. The furthest I have gone is with:

x = df.xs(('x',), level=('coords',), axis=1)
y = df.xs(('y',), level=('coords',), axis=1)
result = np.stack((x, y), axis=2)

Which gives me an array of shape (40008, 100, 2) instead of (4000800, 3).

Any help would be greatly appreciated, thank you!

CodePudding user response:

Try this:

import numpy as np
import pandas as pd

# A smaller input dataframe to see if I understand your problem correctly
index = pd.MultiIndex.from_product(
    [range(5), list("kxy")], names=["segment", "coords"]
)

df = pd.DataFrame(np.arange(10 * len(index)).reshape(-1, len(index)), columns=index)

# The manipulation
result = (
    df.rename_axis("index")
    .stack("segment")
    .reset_index()[["index", "x", "y"]]
    .to_numpy()
)
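
If you would rather stay in NumPy, as in the attempt from the question, something along these lines might also work. It is only a sketch: the small `df` below is a made-up stand-in (4 rows, 3 segments), and it assumes every segment has both an x and a y column in the same segment order. The idea is to flatten x and y row by row and repeat each row label once per segment so the three columns line up.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the real data: 4 rows, 3 segments, coords k/x/y
columns = pd.MultiIndex.from_product(
    [range(3), list("kxy")], names=["segment", "coords"]
)
df = pd.DataFrame(
    np.arange(4 * len(columns), dtype=float).reshape(4, -1), columns=columns
)

# Same selection as in the question, converted to plain arrays
x = df.xs("x", level="coords", axis=1).to_numpy()  # shape (rows, segments)
y = df.xs("y", level="coords", axis=1).to_numpy()

# Repeat each row label once per segment to match the row-major ravel of x and y
idx = np.repeat(df.index.to_numpy(), x.shape[1])

result = np.column_stack((idx, x.ravel(), y.ravel()))
print(result.shape)  # (rows * segments, 3)
```

On the full dataframe from the question this should give (40008 * 100, 3), i.e. (4000800, 3), with the rows ordered by index first and segment second.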