Simple neural network for a linear regression problem (one input, one output) behaves the exact opposite of what is expected


I am trying to solve a simple linear regression problem with a neural network (preferably in TensorFlow). I have 17 years of population data, and the population is decreasing continuously.

When I make predictions there are 2 problems:

  1. The predicted population is far away from the last observed value (the final 'loss' is 642388.5000).
  2. The population is increasing as the years pass when it should clearly decrease.

What can I do to reduce the error? And why does the model predict that the population is increasing over time when it clearly is not?

The code I wrote:

import numpy as np
import matplotlib.pyplot as plt
from tensorflow.python.keras.layers import Input, Dense
from tensorflow.python.keras.models import Model

X = np.arange(2002, 2019)
y = np.linspace(21730496, 19473970, 17).astype(float)

input1 = Input(shape=(1,))
l1 = Dense(10, activation='relu')(input1)
l2 = Dense(50, activation='relu')(l1)
l3 = Dense(50, activation='relu')(l2)
out = Dense(1)(l3)

model = Model(inputs=input1, outputs=[out])
model.compile(
    optimizer='adam',
    loss=['mae']
)

history = model.fit(X, y, epochs=500, batch_size=10)

model.predict([2019., 2020., 2021.])

CodePudding user response:

There are a few things you can try to reduce the error in your model's predictions:

Increase the number of data points: With only 17 data points, the model may not have enough information to learn the underlying trend. More data points may help it capture that trend and improve its predictions.

Normalize the input data: Normalizing the inputs to have a mean of 0 and a standard deviation of 1 may help the model converge faster and improve its performance (see the sketch after these suggestions).

Increase the complexity of the model: More layers or units may let the model capture more complex patterns in the data and improve its predictions.

Use a different optimization algorithm: Different optimizers suit different problems. You could try stochastic gradient descent or RMSprop instead of Adam and see whether it improves the model's performance.
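For example, here is a minimal sketch of the normalization and optimizer suggestions. It assumes scikit-learn's StandardScaler, the public tensorflow.keras import path, and the RMSprop optimizer, none of which appear in the original code, and it also standardizes the target, which is not strictly required:

import numpy as np
from sklearn.preprocessing import StandardScaler
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model

X = np.arange(2002, 2019).astype(float).reshape(-1, 1)
y = np.linspace(21730496, 19473970, 17).reshape(-1, 1)

# Standardize both the years and the population counts to mean 0, std 1
x_scaler = StandardScaler().fit(X)
y_scaler = StandardScaler().fit(y)
X_std = x_scaler.transform(X)
y_std = y_scaler.transform(y)

# A small network is enough for a nearly linear trend
inp = Input(shape=(1,))
hidden = Dense(10, activation='relu')(inp)
out = Dense(1)(hidden)
model = Model(inputs=inp, outputs=out)

# Swap Adam for RMSprop (or 'sgd') to compare optimizers
model.compile(optimizer='rmsprop', loss='mae')
model.fit(X_std, y_std, epochs=500, batch_size=10, verbose=0)

# Map predictions back to the original population scale
X_test = np.array([[2019.], [2020.], [2021.]])
print(y_scaler.inverse_transform(model.predict(x_scaler.transform(X_test))))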

As for the issue of the model predicting an increasing population, there are a few possible reasons:

The model is overfitting: If the model is overly complex relative to the amount of data, it may fit the training data too closely and fail to generalize, which leads to poor predictions on unseen data.

The model is not learning the correct trend: If the model has not picked up the underlying trend in the data, its predictions will be off. This can be caused by too little data, a poor choice of architecture, or an unsuitable optimizer.

The model is learning a different trend: The model may be fitting a spurious pattern rather than the true trend, for example because of noise in the data or too few data points.

To address these issues, try the suggestions listed above. You may also want to adjust the model architecture or switch to a simpler model, such as a plain linear regression, and see whether it performs better on this data (a minimal baseline sketch follows).
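As a rough baseline (not from the original post), an ordinary least-squares line fitted with NumPy already recovers the decreasing trend:

import numpy as np

X = np.arange(2002, 2019).astype(float)
y = np.linspace(21730496, 19473970, 17)

# Fit population = slope * year + intercept; the slope comes out negative
slope, intercept = np.polyfit(X, y, 1)

# Extrapolate to future years
for year in (2019.0, 2020.0, 2021.0):
    print(year, slope * year + intercept)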

CodePudding user response:

I had to rescale both the input and the output data to the [0, 1] interval. Here is the modified code, which works exactly as I expected:

import numpy as np
import matplotlib.pyplot as plt
from tensorflow.python.keras.layers import Input, Dense
from tensorflow.python.keras.models import Model
from sklearn.preprocessing import MinMaxScaler

X = np.arange(2002, 2019)
y = np.linspace(21730496, 19473970, 17).astype(float)

# Rescale the years into [0, 1]
X_scalar = MinMaxScaler()
X_scalar.fit(X.reshape(-1, 1))
X_scaled = X_scalar.transform(X.reshape(-1, 1))

# Rescale the population counts into [0, 1] as well
y_scalar = MinMaxScaler()
y_scalar.fit(y.reshape(-1, 1))
y_scaled = y_scalar.transform(y.reshape(-1, 1))

input1 = Input(shape=(1,))
l1 = Dense(10, activation='relu')(input1)
l2 = Dense(50, activation='relu')(l1)
l3 = Dense(50, activation='relu')(l2)
out = Dense(1)(l3)

model = Model(inputs=input1, outputs=[out])
model.compile(
    optimizer='adam',
    loss=['mae']
)

history = model.fit(X_scaled, y_scaled, epochs=500, batch_size=10)

# Scale the test years with the same scaler, then map predictions back to population counts
X_test = np.array([2019., 2020., 2021.])
X_test_scaled = X_scalar.transform(X_test.reshape(-1, 1))

print(y_scalar.inverse_transform(model.predict(X_test_scaled)))
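One optional way to double-check that the fitted model now follows the decreasing trend (this plotting snippet is not part of the original answer) is to compare the training data with the inverse-transformed predictions, reusing the variables defined above:

# Predict over the observed years plus a few future ones and undo the scaling
years = np.arange(2002, 2025).astype(float)
years_scaled = X_scalar.transform(years.reshape(-1, 1))
population_pred = y_scalar.inverse_transform(model.predict(years_scaled))

plt.plot(X, y, 'o', label='observed population')
plt.plot(years, population_pred, '-', label='model prediction')
plt.xlabel('year')
plt.ylabel('population')
plt.legend()
plt.show()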