How to get median values across diagonal lines in a matrix?

Time:10-06

I have the following matrix in pandas:

import numpy as np
import pandas as pd

df_matrix = pd.DataFrame(np.random.random((10, 10)))

I need a vector of 10 median values, one for each blue line in the picture below:

[Image: the matrix with blue lines marking the diagonals to aggregate]

The last value in the output vector is just a single matrix element rather than a true median, because that line covers only one cell.

CodePudding user response:

X = np.random.random((10, 10))
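fX = np.fliplr(X)  # flip left-right so the "other" diagonals become ordinary diagonals
# k=0 is the main anti-diagonal; each negative offset moves one line toward the bottom-right corner
np.array([np.median(np.diag(fX, k=-k)) for k in range(X.shape[0])])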
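To see why the flip works, here is a small sketch on a 3x3 matrix whose anti-diagonals are easy to read off by hand. The function name `lower_antidiag_medians` is made up for this illustration and is not part of NumPy; the sketch also assumes a square matrix.

```python
import numpy as np

def lower_antidiag_medians(X):
    """Hypothetical helper: median of each anti-diagonal of a square matrix,
    from the main anti-diagonal down to the bottom-right corner."""
    fX = np.fliplr(X)  # anti-diagonals of X become ordinary diagonals of fX
    n = X.shape[0]
    # offset 0 is the main anti-diagonal; offsets -1, -2, ... are the lines below it
    return np.array([np.median(np.diag(fX, k=-k)) for k in range(n)])

X = np.arange(9).reshape(3, 3)
# [[0 1 2]
#  [3 4 5]
#  [6 7 8]]
result = lower_antidiag_medians(X)
print(result)  # [4. 6. 8.]
```

The three lines are (2, 4, 6), (5, 7) and (8), so the medians are 4, 6 and 8. The last entry is a single element, which matches what the question describes.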

CodePudding user response:

The diagonals are such that row_num + col_num = constant. So you can stack the frame, sum the row and column labels, and group by that sum:

(df_matrix.stack().reset_index(name='val')
   .assign(diag=lambda x: x.level_0 + x.level_1)  # enumerate the diagonals
   .groupby('diag')['val'].median()             # median by diagonal
   .loc[len(df_matrix) - 1:]                    # main anti-diagonal and the lower triangle diagonals
)

Output (for np.random.seed(42)):

diag
9     0.473090
10    0.330898
11    0.531382
12    0.440152
13    0.548075
14    0.325330
15    0.580145
16    0.427541
17    0.248817
18    0.107891
Name: val, dtype: float64
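As a sanity check, the groupby approach can be compared against a flip-and-diagonal computation in NumPy. This is only a sketch: it assumes a square frame with default integer row and column labels, and it starts the slice at `n - 1` because the main anti-diagonal is where row + col equals `n - 1`. The variable names are chosen just for this example.

```python
import numpy as np
import pandas as pd

np.random.seed(42)  # any seed works; the comparison does not depend on it
df_matrix = pd.DataFrame(np.random.random((10, 10)))
n = len(df_matrix)

# groupby approach: label every cell with row + col, then take the median per label
stacked = df_matrix.stack().reset_index(name='val')
by_sum = (stacked
          .assign(diag=lambda x: x.level_0 + x.level_1)
          .groupby('diag')['val'].median()
          .loc[n - 1:])  # labels n-1 .. 2n-2: main anti-diagonal and below

# NumPy approach: flip left-right and read off ordinary diagonals
fX = np.fliplr(df_matrix.to_numpy())
by_flip = np.array([np.median(np.diag(fX, k=-k)) for k in range(n)])

agree = np.allclose(by_sum.to_numpy(), by_flip)
print(len(by_sum), agree)  # 10 True
```

Both approaches return the 10 values in the same order, from the full-length anti-diagonal to the single bottom-right cell.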