Multiplying multi-dimensional array by 1d array, to find nth product

Time:03-10

I have a large stack of matrices stored as a 3D NumPy array, for example matrices = np.random.rand(15, 10, 10)

Each of the 15 matrices is 10x10, covering states A-J. The matrices are in order and represent time in one-year increments, starting from matrices[0], which contains the matrix values for year 1, up to matrices[14] for year 15.

The table below shows an example of my customer data. I have 12,000 customers.

customer| current_state | year  | amount
ax111   |   A           |   3   |  300
ax112   |   D           |   4   |  4890
ax113   |   G           |   9   |  624

I basically need to match each customer's year to the correct matrix and place their amount at the position of their current_state, creating a vector for each customer. Example:

ax111 = np.array([300,0,0,0,0,0,0,0,0,0]) (amount 300 placed at state A, 1st element)

ax112 = np.array([0,0,0,4890,0,0,0,0,0,0]) (amount 4890 placed at state D, 4th element)
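The per-customer vectors described above could be built for all customers at once. The following is only a minimal sketch: the variable names (states, current_state, amount, state_idx) are hypothetical, and it assumes the states map to positions in alphabetical order (A is index 0, J is index 9).

```python
import numpy as np

# Hypothetical sample of the customer table (names are illustrative only).
states = "ABCDEFGHIJ"
current_state = ["A", "D", "G"]
amount = np.array([300, 4890, 624])

# Map each state letter to its column index (A -> 0, ..., J -> 9).
state_idx = np.array([states.index(s) for s in current_state])

# One row per customer, with the amount placed in the current-state column.
customers = np.zeros((len(amount), len(states)), dtype=amount.dtype)
customers[np.arange(len(amount)), state_idx] = amount
```

Each row of customers then matches the hand-written vectors above (ax111 is row 0, ax112 is row 1).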

I then need to multiply each customer's array by the matrix for that customer's year, and continue multiplying the product by the next matrix until year 15 (matrices[14]) is reached for each customer.

The code below works for one customer. How can I run it for all 12,000 customers?

import numpy as np
matrices = np.random.rand(15, 10, 10)
ax111 = np.array([300,0,0,0,0,0,0,0,0,0])
output = ax111
results = []
for arr in matrices[3:14]:
    output = output@arr
    results.append(output)

The code above produces a list of 11 arrays of length 10, one for each matrix in matrices[3:14]. How can I efficiently apply this to 12,000 customers?

CodePudding user response:

For each customer, you repeatedly multiply the customer's array by every matrix from i=year up to i=14. Because matrix multiplication is associative, you can precompute these accumulated matrix products. I.e. instead of

output = (((ax @ matrices[year]) @ matrices[year+1]) @ ...)

you can do

output = ax @ (matrices[year] @ matrices[year+1] @ ...)

and precompute the r.h.s.

Then you can perform a "pairwise matrix multiplication" (pairing each customer with the corresponding accumulated matrix) by performing a pairwise multiplication followed by a sum:

import itertools as it
import numpy as np


# --- Example data ---
rng = np.random.default_rng()
matrices = rng.integers(0, 100, size=(15,10,10))  # using integers for exact results
customers = np.array([
    [300,0,0,0,0,0,0,0,0,0],
    [0,0,0,4890,0,0,0,0,0,0],
    [0,0,0,0,0,0,624,0,0,0],
])
years = [3, 4, 9]


# --- Reference computation ---
results = []
for c, y in zip(customers, years):
    for m in matrices[y:]:
        c = c @ m
    results.append(c)
results = np.stack(results)


# --- Vectorized approach ---
matrices = np.stack([*it.accumulate(matrices[::-1], lambda x,y: y@x)][::-1])
new = (customers[:,:,None] * matrices[years]).sum(axis=1)


assert np.array_equal(new, results)
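As an alternative spelling of the "pairwise multiplication followed by a sum" step, np.einsum can express the same contraction without building the intermediate (customers, 10, 10) product. This is only a sketch on made-up float data; the names acc, out and ref are illustrative, not part of the answer above.

```python
import itertools as it
import numpy as np

rng = np.random.default_rng(0)
matrices = rng.random((15, 10, 10))
customers = np.array([
    [300, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 4890, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 624, 0, 0, 0],
], dtype=float)
years = np.array([3, 4, 9])

# acc[i] = matrices[i] @ matrices[i+1] @ ... @ matrices[14]
acc = np.stack([*it.accumulate(matrices[::-1], lambda x, y: y @ x)][::-1])

# For each customer c: out[c, j] = sum_i customers[c, i] * acc[years[c], i, j]
out = np.einsum("ci,cij->cj", customers, acc[years])

# Same computation written as broadcasting followed by a sum, for comparison.
ref = (customers[:, :, None] * acc[years]).sum(axis=1)
```

Both out and ref hold one final vector of length 10 per customer.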

CodePudding user response:

**Not an answer

@a_guest When I run your code, I only get the 1st array for each customer, with shape (3, 10). When I run the for loop with the same matrices data for ax111, I get all the arrays for ax111, with shape (11, 10).

import itertools as it
import numpy as np
rng = np.random.default_rng()
matrices = rng.integers(0, 10, size=(15,10,10))  # using integers for exact results
ax111 = np.array([300,0,0,0,0,0,0,0,0,0])
newoutput = ax111
newresults = []
for arr in matrices[3:14]:
    newoutput = newoutput@arr
    newresults.append(newoutput)

When I run the above code, I get all the arrays for ax111, with shape (11, 10).

customers = np.array([
    [300,0,0,0,0,0,0,0,0,0],
    [0,0,0,4890,0,0,0,0,0,0],
    [0,0,0,0,0,0,624,0,0,0],
])
years = [3, 4, 9]


# --- Reference computation ---
results = []
for c, y in zip(customers, years):
    for m in matrices[y:]:
        c = c @ m
    results.append(c)
results = np.stack(results)


# --- Vectorized approach ---
matrices = np.stack([*it.accumulate(matrices[::-1], lambda x,y: y@x)][::-1])
new = (customers[:,:,None] * matrices[years]).sum(axis=1)


assert np.array_equal(new, results)

When I run the above code, I get an array of shape (3, 10). When I compare results with newresults, I see that the code above only gives one array per customer. Let me know if I'm doing something wrong.
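The difference described here appears to come from what each version stores: the single-customer loop appends output after every multiplication, whereas the vectorized code returns only the final product for each customer. If every intermediate vector is needed, one possible approach is to loop over the 15 time steps while updating all customers at once. This is a sketch, not the answerer's method; history, state and active are hypothetical names, and it assumes (as the question text states) that each customer is multiplied through matrices[14], whereas the loop above stops at matrices[13].

```python
import numpy as np

rng = np.random.default_rng(0)
matrices = rng.random((15, 10, 10))
customers = np.array([
    [300, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 4890, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 624, 0, 0, 0],
], dtype=float)
years = np.array([3, 4, 9])

n_steps = len(matrices)
state = customers.copy()

# history[c, t] holds customer c's vector after applying matrices[t];
# entries before the customer's starting year stay NaN.
history = np.full((len(customers), n_steps, customers.shape[1]), np.nan)

for t in range(n_steps):
    active = years <= t  # customers whose starting index has been reached
    state[active] = state[active] @ matrices[t]
    history[active, t] = state[active]
```

With this layout, history[0, 3:] contains the intermediate vectors for the first customer, and history[:, -1] contains the final vector for every customer. The Python loop runs only 15 times regardless of the number of customers.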
