I'm trying to learn the basics of Tensorflow. So I figured a good first step would be to try and do a simple linear regression to ascertain the equation of a line from noisy data.
So, my first step is to generate the data:
import random;
import matplotlib.pyplot as pyplot
RAND_RANGE = range(-100,100);
X_RANGE = range(-10,10);
NOISINESS = 0.5;
NOISE_RANGE = range(int(RAND_RANGE.start * NOISINESS), int(RAND_RANGE.stop * NOISINESS));
m = random.randrange(RAND_RANGE.start, RAND_RANGE.stop); # Python devs... Why doesn't a function that gets something IN A RANGE have an overload that takes a range... Huh?!
c = random.randrange(RAND_RANGE.start, RAND_RANGE.stop);
n = [random.randrange(NOISE_RANGE.start, NOISE_RANGE.stop) for _ in X_RANGE];
real_x = [x for x in X_RANGE];
real_y = [m * x + c for x in real_x];
noisy_y = [m * x + c + n for x, n in zip(real_x, n)];
Which seems to work quite nicely (plotting code not included for brevity):
So then I'm trying to load this into a basic tensorflow model like this:
import tensorflow as tf;
import numpy as np;
from tensorflow import keras;
from tensorflow.keras import layers;
features = np.array(real_x);
labels = np.array(real_y);
line_model = tf.keras.Sequential(
[
layers.Dense(units=1)
]
)
line_model.build(input_shape=[len(real_y)]);
line_model.summary();
line_model.compile(
optimizer = tf.keras.optimizers.Adam(learning_rate=0.1), loss='mean_absolute_error'
)
history = line_model.fit(
features,
labels,
epochs=100,
verbose=0
)
However, it's here that I'm running into issues. Every tutorial that I've found just seems to load in data from a CSV file using numpy and then just blindly throws it at Tensorflow without really explaining in what form it's expecting the data.
It's very possible I'm misunderstanding something here, but as far as I can tell, my real_x values are my features and the real_y values are the labels.
So I've tried the following:
- Putting the XY values into separate numpy arrays
- Putting the XY values into a single, 2D numpy array
- Various values of input_shape (I think it should just be a single value with the number of items in it, but that gives me an error about dense needing a minimum number of 2 dimensions)
I've had various different errors whilst trying these different things.
I feel like I'm missing something fundamental here and that this shouldn't be too complicated, but I'm reluctant to just copy code wholesale from a tutorial that I don't understand (hence why I'm trying to do my own exercise that's adjacent, but different to what they're doing).
What am I doing wrong here, how should I be loading this data into the model in such a way that will let me predict m and c (or a series of denoised y values I can then calculate m and c from as an extra step) given real_x and noisy_y?
CodePudding user response:
When specifying an input shape for the model, you need to give the number of features for each target (here, one x value per y value), not the number of samples, so this is not correct:
line_model.build(input_shape=[len(real_y)]);
Using the input_shape parameter on a tf.keras.layers.Layer instance:
line_model = tf.keras.Sequential([layers.Dense(units=1, input_shape=(1,))])
Using input_dim:
line_model = tf.keras.Sequential([layers.Dense(units=1, input_dim=1)])
input_shape = (1,) because you have one feature per target; input_dim follows the same logic.
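To make the "one feature per target" idea concrete, here is a minimal NumPy-only sketch of the data layout a single-unit Dense layer works with. It does not use TensorFlow: `dense_forward` is a hypothetical helper that only mimics the arithmetic `x @ kernel + bias`, and the kernel and bias values are made up for illustration.

```python
import numpy as np

# 20 x-values, like real_x in the question
real_x = list(range(-10, 10))

# A flat array has shape (20,): 20 numbers with no feature axis
flat = np.array(real_x, dtype=np.float32)

# Reshape to (samples, features) = (20, 1): 20 samples, 1 feature each
features = flat.reshape(-1, 1)


def dense_forward(x, kernel, bias):
    """Hypothetical stand-in for a Dense layer's arithmetic: x @ kernel + bias."""
    return x @ kernel + bias


# Illustrative values only; kernel has shape (features, units) = (1, 1)
kernel = np.array([[3.0]], dtype=np.float32)  # plays the role of m
bias = np.array([2.0], dtype=np.float32)      # plays the role of c

predictions = dense_forward(features, kernel, bias)
print(flat.shape, features.shape, predictions.shape)
```

The feature axis is the 1 in input_shape=(1,); the 20 is the number of samples, which the model does not need to know in advance.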
Using the delayed-build pattern, which is a little bit different:
line_model.build(input_shape=[None, 1])
When using the .build() method, you need to include the batch size in the shape. Here None indicates that the model accepts any batch size. This is called batch_input_shape.
Say you have 100 samples and pass 40 as the batch size. The last batch will have only 20 elements (40, 40, 20), so the model can accept those last 20 elements if the batch dimension is None; otherwise it will result in an error.
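The batch arithmetic in that example can be checked with a short sketch. `batch_sizes` is a hypothetical helper written for illustration, not a Keras function; it assumes the data is simply split in order and the final partial batch is kept.

```python
def batch_sizes(num_samples, batch_size):
    """Return the size of each batch when the samples are split in order."""
    full_batches, remainder = divmod(num_samples, batch_size)
    sizes = [batch_size] * full_batches
    if remainder:
        # The leftover samples form one smaller final batch
        sizes.append(remainder)
    return sizes


print(batch_sizes(100, 40))  # [40, 40, 20]
```

With a fixed batch dimension of 40, the final batch of 20 would not match; a batch dimension of None avoids that mismatch.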
If you use input_shape or input_dim in the first layer, you do not need to worry about specifying batch_size.