Home > Software design >  The most efficient way to sum all possible pairs (x_ik, y_j) for a given k?
The most efficient way to sum all possible pairs (x_ik, y_j) for a given k?

Time:12-02

I have two numpy array x with shape (n,m) and y with shape (p,). I would like to sum all possible pairs x[k, i] and y[j] to create a new numpy array z with shape (n, m*p).

A naïve algorithm would be :

import numpy as np
# some code
z = np.empty((n, m*p))
for k in range(n):
    for i in range(m):
        for j in range(p):
            z[k, i   m * j] = x[k, i]   y[j]

This algorithm has a polynomial complexity : O(n*m*p) Knowing I am working on array with $n ~ 1e6$ I am looking a for a more efficient algorithm using the power of numpy and/or pandas.

I have done some research and I found a possible solution : Efficient way to sum all possible pairs

But it does not fit with my specific problem, I mean I can use it but it will still not be pythonic as I would iterate with one loop (the solution is reusable without much effort for n=1).

CodePudding user response:

As others have said in the comments, not improving on the complexity but making use of vectorization and memory contiguity:

np.add.outer(X,y).reshape(len(X), -1)
  • Related