I am trying to create a neural network with the Flux package that takes a 100x3 matrix of random points as input and outputs True or False. The corresponding labels are in Y, a 100-element array of Boolean values (1 or 0).
This is my code so far.
# model
model = Chain(Dense(100, 32, relu), Dense(32, 1), sigmoid)
# loss function and the optimizer
loss(x, y) = Flux.binarycrossentropy(model(x), y)
opt = Flux.ADAM()
# Train
Flux.train!(loss, Flux.params(model), [(X, Y)], opt)
What I've noticed is that currently, if I call model(x)
it outputs a 3-element array of probabilities, which is not what I want, since it should output at least a 2-element array with the probabilities of True and False. Also, if I change my model to
model = Chain(Dense(100, 32, relu), Dense(32, 100), sigmoid)
it outputs a 100x3 matrix of probabilities, which is again not correct; I believe it should be 100x2.
Answer:
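In Flux.jl, the observation dimension is always the last one, so Dense(100, 32) treats your 100x3 matrix as 3 observations with 100 features each, which is why you get 3 outputs. I think in your problem, each row of the matrix is an observation. Am I right?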
If that's the case, I think this will solve your problem:
using Flux
# Float32 matches Flux's default parameter type
X = rand(Float32, 3, 100) # each column has one observation
bools = rand(Bool, 100)
# 2x100 matrix: row 1 is "true", row 2 is "false"
Y = reduce(hcat, [[b, !b] for b in bools]) # each column has one label
dataset = [(X, Y)]
model = Chain(
    Dense(3, 32, relu),
    Dense(32, 2),
    sigmoid
)
opt = Flux.setup(Adam(), model)
loss(model, x, y) = Flux.binarycrossentropy(model(x), y)
Flux.train!(loss, model, dataset, opt) # one gradient step on the full batch
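Note: this code uses the newer explicit-style API (Flux.setup, and a loss function that takes the model as its first argument), so it needs a recent version of Flux.jl.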
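To see where the 3-element output in the question comes from, here is a rough sketch of the shape arithmetic in plain Julia, without Flux. The matrices W1, W2, V1 and V2 are hypothetical stand-ins for the layer weights (they are not taken from any real model), and biases and activations are left out because they do not change the shapes.

```julia
# Plain-Julia sketch of the shape arithmetic; Flux is not needed here.
# W1 and W2 are hypothetical stand-ins for the weights of
# Dense(100, 32) and Dense(32, 1) from the question.
W1 = rand(Float32, 32, 100)
W2 = rand(Float32, 1, 32)

X_rows = rand(Float32, 100, 3)   # layout from the question: one observation per row

# A Dense layer multiplies its weight matrix by the input, so each
# COLUMN of the input is treated as one observation.
out_question = W2 * (W1 * X_rows)
@show size(out_question)         # (1, 3)

# Move the observations into columns and size the first layer for 3 features.
X_cols = permutedims(X_rows)     # 3x100
V1 = rand(Float32, 32, 3)        # stand-in for Dense(3, 32)
V2 = rand(Float32, 2, 32)        # stand-in for Dense(32, 2)
out_answer = V2 * (V1 * X_cols)
@show size(out_answer)           # (2, 100)
```

If your data already exists as a 100x3 matrix, permutedims(X) gives you the 3x100 layout without regenerating the points. The model then returns one column of 2 values per observation, i.e. 2x100 rather than 100x2.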