I am trying to implement a dense neural network in Python. I am stuck on the code that computes the layer outputs from the weights and biases. When I apply the * operator between the weight matrix and the input at a certain index, I get the error: ValueError: operands could not be broadcast together with shapes (100,784) (1000,784,1). Am I applying bad indexing in the loop, or what else am I doing wrong? Please help.
import os
import cv2
import numpy as np
from sklearn.utils import shuffle   # shuffle() below matches sklearn.utils.shuffle

##initialize the training and test arrays
train = np.empty((1000,28,28), dtype='float64')
trainY = np.zeros((1000,10,1))
test = np.empty((10000,28,28), dtype='float64')
testY = np.zeros((10000,10,1))
##reading image data into the training array
#load the images
i = 0
for filename in os.listdir('Data/Training1000/'):
    y = int(filename[0])    # first character of the filename is the label
    trainY[i,y] = 1.0
    train[i] = cv2.imread('Data/Training1000/{0}'.format(filename),0)/255.0
    i += 1
##reading image data into the testing array
i = 0
for filename in os.listdir('Data/Test10000'):
    y = int(filename[0])
    testY[i,y] = 1.0
    test[i] = cv2.imread('Data/Test10000/{0}'.format(filename),0)/255.0
    i += 1
##reshape the training and testing arrays
trainX = train.reshape(train.shape[0],train.shape[1]*train.shape[2],1)
testX = test.reshape(test.shape[0],test.shape[1]*test.shape[2],1)
##declare the hidden layer sizes, epochs, and learning rate
numNeuronsLayer1 = 100
numNeuronsLayer2 = 10
numEpochs = 100
learningRate = 0.1
##section to declare the weights and the biases
w1 = np.random.uniform(low=-0.1, high=0.1, size=(numNeuronsLayer1,784))
b1 = np.random.uniform(low=-1, high=1, size=(numNeuronsLayer1,1))
w2 = np.random.uniform(low=-0.1, high=0.1, size=(numNeuronsLayer2,numNeuronsLayer1))
b2 = np.random.uniform(low=-0.1, high=0.1, size=(numNeuronsLayer2,1))
##do the forward pass on the weights and the biases
for n in range(0, numEpochs):
    loss = 0
    trainX, trainY = shuffle(trainX, trainY)
    for i in range(trainX.shape[0]):
        ##this is where I have a problem, the line below throws the error described above
        ##my first pass is declared a2
        a2 = w1*train[i] + w2*trainX[i] + b1
How do I correctly reference my training variables inside the loop above to get rid of the broadcast error? Thank you.
CodePudding user response:
You are very close, but with a couple of problems. First, you need to be doing matrix multiplication. The * operator does element-wise multiplication (e.g., np.array([1,2,3]) * np.array([2,3,4]) gives np.array([2,6,12])). To do matrix multiplication with numpy you can use the @ operator (i.e., matrix1 @ matrix2) or the np.matmul function.
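To see the difference between the two operators, here is a quick sketch with small toy arrays (the names A and B are just for illustration):

```python
import numpy as np

A = np.arange(6).reshape(2, 3)   # shape (2, 3)
B = np.arange(6).reshape(3, 2)   # shape (3, 2)

# Element-wise: operands must have matching (or broadcastable) shapes.
elementwise = A * A              # shape stays (2, 3)

# Matrix product: inner dimensions must agree, (2,3) @ (3,2) -> (2,2).
product = A @ B                  # same result as np.matmul(A, B)

print(elementwise.shape)  # (2, 3)
print(product.shape)      # (2, 2)
```

The broadcast error in the question comes from using the first form where the second was needed.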
Your other problem is the shape of your inputs. I am not sure why you are adding a 3rd dimension (the trailing 1 in train.reshape(train.shape[0],train.shape[1]*train.shape[2],1)). You should be fine keeping each example as a flat row: change it to train.reshape(train.shape[0],train.shape[1]*train.shape[2]) and change the test.reshape accordingly.
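Concretely, dropping the trailing 1 turns the data into a plain 2-D matrix of flattened images, which is what the matrix products below expect:

```python
import numpy as np

# Stand-in for the loaded image stack: 1000 images of 28x28 pixels.
train = np.empty((1000, 28, 28), dtype='float64')

# Flatten each 28x28 image into a single 784-element row.
trainX = train.reshape(train.shape[0], train.shape[1] * train.shape[2])

print(trainX.shape)  # (1000, 784)
```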
Finally, your inference line a2 = w1*train[i] + w2*trainX[i] + b1 is a little off: you must first calculate a1 before a2. An important part of matrix multiplication is that the inner dimensions must agree (i.e., you cannot multiply matrices of shapes [100,50] and [100,50], but you can multiply matrices of shapes [100,50] and [50,60]; the resulting shape of the matrix product is the outer dimensions of the operands, in this case [100,60]). Thanks to matrix multiplication, you can also get rid of the for loop over training examples: all examples are calculated at the same time. So to calculate a1, we need to transpose w1 and put it on the right-hand side:
a1 = ( trainX @ w1.transpose() ) + b1.transpose()
Then we can calculate a2 as a function of a1:
a2 = ( a1 @ w2.transpose() ) + b2.transpose()
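Putting those two lines together, here is a minimal end-to-end check of the shapes, with random numbers standing in for the image data (the array sizes match the question; the data itself is synthetic):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for 1000 flattened 28x28 images.
trainX = rng.random((1000, 784))

numNeuronsLayer1, numNeuronsLayer2 = 100, 10
w1 = rng.uniform(-0.1, 0.1, size=(numNeuronsLayer1, 784))
b1 = rng.uniform(-1, 1, size=(numNeuronsLayer1, 1))
w2 = rng.uniform(-0.1, 0.1, size=(numNeuronsLayer2, numNeuronsLayer1))
b2 = rng.uniform(-0.1, 0.1, size=(numNeuronsLayer2, 1))

# Forward pass over all examples at once; the inner dims agree at each step.
a1 = (trainX @ w1.transpose()) + b1.transpose()  # (1000,784)@(784,100) -> (1000,100)
a2 = (a1 @ w2.transpose()) + b2.transpose()      # (1000,100)@(100,10)  -> (1000,10)

print(a1.shape, a2.shape)  # (1000, 100) (1000, 10)
```

Note there is no per-example loop: one matrix product handles the whole batch, which is also much faster than iterating in Python.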