I'm trying to use Octave to submit an assignment written in MATLAB. The h_theta2 matrix is a 5000x10 matrix in MATLAB (please see the attached screenshot), and the code works fine in MATLAB. But when I try to submit the assignment in Octave, it returns the following error:
Submission failed: operator -: nonconformant arguments (op1 is 16x4, op2 is 5000x10)
LineNumber: 98 (which refers to delta3=h_theta2-y_2 in the attached screenshot.)
This (I'm guessing) means that Octave is treating h_theta2 as a 16x4 matrix.
The code is supposed to estimate the cost function and gradient of a neural network. X, y, Theta1 and Theta2 are given in the assignment.
function [J grad] = nnCostFunction(nn_params, ...
input_layer_size, ...
hidden_layer_size, ...
num_labels, ...
X, y, lambda)
%NNCOSTFUNCTION Implements the neural network cost function for a two-layer
%neural network which performs classification.
%   [J grad] = NNCOSTFUNCTION(nn_params, hidden_layer_size, num_labels, ..., X, y, lambda)
%   computes the cost and gradient of the neural network. The parameters for
%   the neural network are "unrolled" into the vector nn_params and need to be
%   converted back into the weight matrices.
%   The returned parameter grad should be an "unrolled" vector of the partial
%   derivatives of the neural network.
Reshape nn_params back into the parameters Theta1 and Theta2, the weight matrices. For a 2-layer neural network:
Theta1 = reshape(nn_params(1:hidden_layer_size * (input_layer_size + 1)), ...
hidden_layer_size, (input_layer_size + 1));
Theta2 = reshape(nn_params((1 + (hidden_layer_size * (input_layer_size + 1))):end), ...
num_labels, (hidden_layer_size + 1));
m = size(X, 1);
I need to return the following variables correctly:
J = 0;
Theta1_grad = zeros(size(Theta1));
Theta2_grad = zeros(size(Theta2));
The sigmoid function is defined in another file and is called here to calculate h_theta1 and h_theta2.
%Sigmoid function:
function g = sigmoid(z)
%SIGMOID Compute sigmoid function
% J = SIGMOID(z) computes the sigmoid of z.
g = 1.0 ./ (1.0 + exp(-z));
end
Feedforward the neural network and return the cost in the variable J:
X = [ones(m, 1) X];
h_theta1=sigmoid(X*Theta1');
h_theta1=[ones(m,1) h_theta1];
h_theta2=sigmoid(h_theta1*Theta2');
y_2=zeros(5000,10);
for k=1:10
condition=y(:,1)==k;
y_2(condition,k)=1;
end
for i=1:m
for k=1:num_labels
e(i,k)=-y_2(i,k)'*log(h_theta2(i,k))-(1-y_2(i,k)')*log(1-h_theta2(i,k));
end
end
J=(1/m)*sum(e);
J=sum(J);
Theta_1=Theta1;
Theta_2=Theta2;
Theta_1(:,1)=[];
Theta_2(:,1)=[];
%Regularized cost function:
J=J+(lambda/(2*m))*(sum(sum(Theta_1.*Theta_1))+sum(sum(Theta_2.*Theta_2)));
%Gradient calculation
delta3=h_theta2-y_2;
delta2=(delta3*Theta2).*h_theta1.*(1-h_theta1);
Theta2_grad=Theta2_grad+delta3'*h_theta1;
Theta2_grad=(1/m)*Theta2_grad;
delta_2=delta2;
delta_2(:,1)=[];
Theta1_grad=Theta1_grad+delta_2'*X;
Theta1_grad=(1/m)*Theta1_grad;
I then submit the above code using the submit() function in Octave. The code works for the J calculation but then gives the following error:
octave:80> submit()
== Submitting solutions | Neural Networks Learning...
Use token from last successful submission? (Y/n): Y
!! Submission failed: operator -: nonconformant arguments
(op1 is 16x4, op2 is 5000x10)
Function: nnCostFunction
LineNumber: 98
Please correct your code and resubmit.
Any help would be much appreciated.
Answer:
I figured out where the problem was. The grader tests my answer with a totally different dataset, and I had created y_2 with fixed dimensions. What I should have done instead was create y_2 as follows:
y_2=zeros(m,num_labels);
for k=1:num_labels
condition=y(:,1)==k;
y_2(condition,k)=1;
end
This makes the code work for any values of m and num_labels.
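For anyone who wants to check this without the grader, here is a minimal sketch on made-up data. The sizes (16 examples, 4 labels) are only an assumption chosen to mirror the 16x4 in the error message, the label vector y is invented, and h_theta2 is a constant stand-in rather than real network output. It builds y_2 with the size-independent loop from above, compares it with a vectorized alternative based on bsxfun, and confirms that the subtraction on the failing line now conforms.

```octave
% Hypothetical toy data: sizes are assumed, not taken from the grader.
m = 16;
num_labels = 4;
y = mod((0:m-1)', num_labels) + 1;   % made-up labels cycling through 1..4

% Size-independent loop, as in the fix above.
y_2 = zeros(m, num_labels);
for k = 1:num_labels
  condition = y(:,1) == k;
  y_2(condition, k) = 1;
end

% Vectorized alternative: compare each label with the row vector 1:num_labels.
y_vec = double(bsxfun(@eq, y, 1:num_labels));

% Stand-in for the network output; only its size matters for this check.
h_theta2 = 0.5 * ones(m, num_labels);
delta3 = h_theta2 - y_2;             % conforms because both operands are m x num_labels

disp(size(delta3));
disp(isequal(y_2, y_vec));
```

The loop and the bsxfun version should produce the same matrix, so either can be used; the important part is that both take their dimensions from m and num_labels instead of hard-coded numbers.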