Based on the comments, I have revised the example:
Consider the following code:

import numpy as np

def subspace_angle(A, B):
    M = A.T @ B
    s = np.linalg.svd(M, compute_uv=False)
    return s[0]

def principal_angles(bases):
    k = bases.shape[0]
    r = np.zeros((k, k))
    for i in range(k):
        x = bases[i]
        r[i, i] = subspace_angle(x, x)
        for j in range(i):
            y = bases[j]
            r[i, j] = subspace_angle(x, y)
            r[j, i] = r[i, j]
    r = np.minimum(1, r)
    return np.rad2deg(np.arccos(r))
The following is an example of its use:
bases = []
# number of subspaces
k = 5
# ambient dimension
n = 8
# subspace dimension
m = 4
for i in range(k):
    X = np.random.randn(n, m)
    Q, R = np.linalg.qr(X)
    bases.append(Q)
# combine the orthonormal bases for all the subspaces
bases = np.array(bases)
# Compute the smallest principal angles between each pair of subspaces.
print(np.round(principal_angles(bases), 2))
Is there a way to avoid the two-level for loop in the principal_angles function, so that the code can be sped up?
The matrix r produced by this code is symmetric. Since subspace_angle can be compute-heavy depending on the array sizes, it is important to avoid computing it twice, once for r[i, j] and again for r[j, i].
Regarding the comment about JIT: I am actually writing this code with Google JAX. The two-level loop does get JIT-compiled and gives a performance benefit, but the JIT compilation time is quite high (possibly due to the two levels of for loops). I am wondering whether there is a better way to write this code so that it compiles faster.
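For reference, this is roughly how the loop version gets wrapped for JIT on my side; the names below (subspace_angle_jax, principal_angles_jax) are just for this sketch, a direct translation of the NumPy code above to jax.numpy:

import jax
import jax.numpy as jnp

def subspace_angle_jax(A, B):
    # cosine of the smallest principal angle between span(A) and span(B)
    s = jnp.linalg.svd(A.T @ B, compute_uv=False)
    return s[0]

@jax.jit
def principal_angles_jax(bases):
    k = bases.shape[0]
    r = jnp.zeros((k, k))
    # the Python loops are unrolled during tracing, which is where
    # most of the compilation time goes
    for i in range(k):
        for j in range(i + 1):
            c = subspace_angle_jax(bases[i], bases[j])
            r = r.at[i, j].set(c)
            r = r.at[j, i].set(c)
    r = jnp.minimum(1.0, r)
    return jnp.rad2deg(jnp.arccos(r))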
CodePudding user response:
I started to copy your code into an ipython session, getting a (5,8,4) shaped bases. But then I realized that func (from an earlier revision of the question) is undefined. So, commenting that out, I get:
In [6]: def principal_angles(bases):
   ...:     k = bases.shape[0]
   ...:     r = np.zeros((k, k))
   ...:     for i in range(k):
   ...:         x = bases[i]
   ...:         # r[i, i] = func(x, x)
   ...:         for j in range(i):
   ...:             y = bases[j]
   ...:             r[i, j] = subspace_angle(x, y)
   ...:             # r[j, i] = r[i, j]
   ...:     return r
   ...:     # r = np.minimum(1, r)
   ...:     # return np.rad2deg(np.arccos(r))
   ...:
In [7]: r=principal_angles(bases)
In [8]: r.shape
Out[8]: (5, 5)
Since both matmul and svd can work with higher dimensions, i.e. batches, I wonder if it's possible to call subspace_angle with all of bases, rather than iteratively. We have to think carefully about what shapes we pass it, and how they evolve.
def subspace_angle(A, B):
    M = A.T @ B
    s = np.linalg.svd(M, compute_uv=False)
    return s[0]
(Oops, my OS just crashed the terminal, so I'll have to get back to this later.)
So A and B are (8,4), A.T is (4,8), and A.T @ B is (4,4). If they were (5,8,4), A.transpose(0,2,1) would be (5,4,8), and M would be (5,4,4).
I believe np.linalg.svd accepts that batched M; with compute_uv=False it returns a (5,4) array of singular values.
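A quick stand-alone shape check of that batched path (my own small sketch, just to confirm how the batched matmul and svd broadcast):

import numpy as np

A = np.random.randn(5, 8, 4)               # five (8,4) bases stacked
M = A.transpose(0, 2, 1) @ A               # (5,4,8) @ (5,8,4) -> (5,4,4)
s = np.linalg.svd(M, compute_uv=False)     # batched SVD, singular values only
print(M.shape, s.shape)                    # (5, 4, 4) (5, 4)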
In [29]: r=principal_angles(bases)
In [30]: r
Out[30]:
array([[0. , 0. , 0. , 0. , 0. ],
[0.99902153, 0. , 0. , 0. , 0. ],
[0.99734371, 0.95318936, 0. , 0. , 0. ],
[0.99894054, 0.99790422, 0.87577343, 0. , 0. ],
[0.99840093, 0.92809283, 0.99896121, 0.98286429, 0. ]])
Let's try that with the whole bases. Use broadcasting to get the 'outer' product on the first dimension:
In [31]: M=bases[:,None,:,:].transpose(0,1,3,2)@bases
In [32]: r1=np.linalg.svd(M, compute_uv=False)
In [33]: M.shape
Out[33]: (5, 5, 4, 4)
In [34]: r1.shape
Out[34]: (5, 5, 4)
To match your s[0] I have to use (need to review the svd docs):
In [35]: r1[:,:,0]
Out[35]:
array([[1. , 0.99902153, 0.99734371, 0.99894054, 0.99840093],
[0.99902153, 1. , 0.95318936, 0.99790422, 0.92809283],
[0.99734371, 0.95318936, 1. , 0.87577343, 0.99896121],
[0.99894054, 0.99790422, 0.87577343, 1. , 0.98286429],
[0.99840093, 0.92809283, 0.99896121, 0.98286429, 1. ]])
The time savings aren't massive, but they may improve if the first dimension is larger than 5:
In [36]: timeit r=principal_angles(bases)
320 µs ± 554 ns per loop (mean ± std. dev. of 7 runs, 1000 loops each)
In [37]: %%timeit
...: M=bases[:,None,:,:].transpose(0,1,3,2)@bases
...: r1=np.linalg.svd(M, compute_uv=False)[:,:,0]
...:
...:
190 µs ± 450 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
This may be enough to get you started with a more refined "vectorization".
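Packaged as a function, the batched version might look like this (my own wrapper name; it reproduces the degree-valued output of your original principal_angles, though I haven't benchmarked it beyond the timings above):

def principal_angles_vectorized(bases):
    # pairwise products of all bases: (k,1,m,n) @ (k,n,m) -> (k,k,m,m)
    M = bases[:, None, :, :].transpose(0, 1, 3, 2) @ bases
    # batched SVD; keep the largest singular value of each block
    c = np.linalg.svd(M, compute_uv=False)[:, :, 0]
    # clip rounding noise above 1 before arccos, as in the original code
    c = np.minimum(1, c)
    return np.rad2deg(np.arccos(c))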
CodePudding user response:
After some more thinking and experimenting with the np.triu_indices function, I have come up with the following solution, which avoids unnecessary extra computation.
def vectorized_principal_angles(subspaces):
    # number of subspaces
    n = subspaces.shape[0]
    # indices for the strict upper triangular part
    i, j = np.triu_indices(n, k=1)
    # prepare all the possible pairs of A and B
    A = subspaces[i]
    B = subspaces[j]
    # compute the Hermitian transpose of each matrix in the A array
    AH = np.conjugate(np.transpose(A, axes=(0, 2, 1)))
    # compute M = A^H B for each matrix pair
    M = np.matmul(AH, B)
    # compute the SVD for each matrix in M
    s = np.linalg.svd(M, compute_uv=False)
    # keep only the first (largest) singular value for each M
    s = s[:, 0]
    # prepare the result matrix
    # It is known in advance that the diagonal elements will be 1.
    r = 0.5 * np.eye(n)
    r[i, j] = s
    # symmetrize the matrix
    r = r + r.T
    # final result
    return r
Here is what is going on:

- np.triu_indices(n, k=1) gives me the indices for the n(n-1)/2 possible pairs of matrices.
- All the remaining computation is limited to those n(n-1)/2 pairs only.
- Finally, the array of scalar values is put back into a square symmetric result matrix (a short usage sketch follows below).
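As a small usage sketch (my own example, reusing the bases array from the question): the function returns the cosines of the smallest principal angles, so arccos is still needed to recover the angles in degrees as in the original principal_angles.

r = vectorized_principal_angles(bases)              # (k, k) cosines, 1.0 on the diagonal
angles = np.rad2deg(np.arccos(np.minimum(1, r)))    # same degree-valued output as before
print(np.round(angles, 2))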
Thank you @hpaulj for your solution. It helped a lot in pointing me in the right direction.