I am currently building a neural network in Python using only NumPy.
This is the layout of the problem area:
I have one array holding the input values: the columns are the input neurons and the rows are the different training data points. It has shape (3, 3):
in_a
array([['t_1a_1', 't_1a_2', 't_1a_3'],
['t_2a_1', 't_2a_2', 't_2a_3'],
['t_3a_1', 't_3a_2', 't_3a_3']], dtype='<U6')
Then I have an array for the weights, where the rows are the input neurons the connections start from (1, 2 and 3) and the columns are the outputs they go to (1, 2 and 3). It also has shape (3, 3):
in_w
array([['w_11', 'w_12', 'w_13'],
['w_21', 'w_22', 'w_23'],
['w_31', 'w_32', 'w_33']], dtype='<U4')
Now I want to compute an array out with shape (3, 3, 3) that looks as follows:
out
array([[['t_1*a_1*w_11', 't_1*a_1*w_12', 't_1*a_1*w_13'],
['t_1*a_2*w_21', 't_1*a_2*w_22', 't_1*a_2*w_23'],
['t_1*a_3*w_31', 't_1*a_3*w_32', 't_1*a_3*w_33']],
[['t_2*a_1*w_11', 't_2*a_1*w_12', 't_2*a_1*w_13'],
['t_2*a_2*w_21', 't_2*a_2*w_22', 't_2*a_2*w_23'],
['t_2*a_3*w_31', 't_2*a_3*w_32', 't_2*a_3*w_33']],
[['t_3*a_1*w_11', 't_3*a_1*w_12', 't_3*a_1*w_13'],
['t_3*a_2*w_21', 't_3*a_2*w_22', 't_3*a_2*w_23'],
['t_3*a_3*w_31', 't_3*a_3*w_32', 't_3*a_3*w_33']]], dtype='<U12')
I tried numpy.dot, plain * multiplication and the @ operator, but nothing worked. I think numpy.einsum or numpy.tensordot might be a solution, but I could not wrap my head around them. Does anybody know how to compute the out array from the two input arrays, or can you recommend a method and explain it? Thanks for your help.
CodePudding user response:
All you need is
in_a[...,None] * in_w
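If you think about it, in_a has shape (training_sets, input_neurons) and in_w has shape (input_neurons, output_neurons). Your output is an element-wise multiplication of the two once in_a gets a trailing axis of length 1, so that broadcasting lines up the input_neurons axes:
(T, I, 1) # in_a[..., None]
* (I, O) # in_w
= (T, I, O) # out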
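As a rough numeric check of the shape reasoning above, the sketch below uses made-up random arrays with deliberately different sizes (T=4, I=3, O=2, all hypothetical) so that each axis is easy to tell apart. It also shows an equivalent numpy.einsum call, since the question mentions einsum; the subscript string 'ti,io->tio' keeps every index, so nothing is summed.

```python
import numpy as np

# Hypothetical sizes, chosen to differ so the axes cannot be confused.
T, I, O = 4, 3, 2
rng = np.random.default_rng(0)
in_a = rng.normal(size=(T, I))   # (training_sets, input_neurons)
in_w = rng.normal(size=(I, O))   # (input_neurons, output_neurons)

# in_a[..., None] has shape (T, I, 1); broadcasting against (I, O) gives (T, I, O).
out = in_a[..., None] * in_w

# Same product with einsum: every index appears in the output, so no sum happens.
out_einsum = np.einsum('ti,io->tio', in_a, in_w)

print(out.shape)
print(np.allclose(out, out_einsum))
```

Each entry out[t, i, o] is in_a[t, i] * in_w[i, o], which matches the pattern in the desired out array.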
Let's demonstrate this for fun
import numpy as np

class Variable:
    """Symbolic placeholder: multiplying two Variables joins their names."""
    def __init__(self, name):
        self.name = name

    def __mul__(self, other):
        if not isinstance(other, Variable):
            raise ValueError
        return Variable(f'{self.name}*{other.name}')

    def __repr__(self):
        return self.name

def generate_array(fmt, rows, columns):
    # Object array of Variables named by 1-based row and column indices.
    return np.array([[Variable(fmt.format(i, j)) for j in range(1, columns + 1)]
                     for i in range(1, rows + 1)])

in_a = generate_array('t_{}a_{}', 3, 3)
in_w = generate_array('ww_{}{}', 3, 3)
print(in_a[...,None] * in_w)
Which prints
[[[t_1a_1*ww_11 t_1a_1*ww_12 t_1a_1*ww_13]
  [t_1a_2*ww_21 t_1a_2*ww_22 t_1a_2*ww_23]
  [t_1a_3*ww_31 t_1a_3*ww_32 t_1a_3*ww_33]]

 [[t_2a_1*ww_11 t_2a_1*ww_12 t_2a_1*ww_13]
  [t_2a_2*ww_21 t_2a_2*ww_22 t_2a_2*ww_23]
  [t_2a_3*ww_31 t_2a_3*ww_32 t_2a_3*ww_33]]

 [[t_3a_1*ww_11 t_3a_1*ww_12 t_3a_1*ww_13]
  [t_3a_2*ww_21 t_3a_2*ww_22 t_3a_2*ww_23]
  [t_3a_3*ww_31 t_3a_3*ww_32 t_3a_3*ww_33]]]
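If the per-connection products are only an intermediate step, summing them over the input-neuron axis should give the usual layer output in_a @ in_w. A minimal sketch, assuming numeric arrays with hypothetical sizes rather than the symbolic Variable objects:

```python
import numpy as np

# Hypothetical numeric data: 4 training points, 3 input neurons, 2 output neurons.
rng = np.random.default_rng(1)
in_a = rng.normal(size=(4, 3))
in_w = rng.normal(size=(3, 2))

# One product per (training point, input neuron, output neuron) triple.
per_connection = in_a[..., None] * in_w    # shape (4, 3, 2)

# Summing over the input-neuron axis collapses it to an ordinary matrix product.
summed = per_connection.sum(axis=1)        # shape (4, 2)

print(np.allclose(summed, in_a @ in_w))
```

Keeping the unsummed (T, I, O) array costs I times more memory than the matrix product, so it is probably only worth building when the individual contributions are actually needed.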