When training a neural network on a lot of batches, will the model have "forgotten" the first batches it saw and predict more accurately (and also overfit) samples similar to the last batches it was trained on?
Is that correct?
Is there a way to correct that?
CodePudding user response:
Yes, that is generally correct. When the data a network sees changes over the course of training, the model tends to fit the most recent batches best and lose accuracy on samples resembling the earliest ones. This is known as catastrophic forgetting. Note that when a fixed dataset is shuffled and revisited over multiple epochs the effect is usually mild; it becomes a real problem when the data distribution shifts over time or when the model is trained on a sequence of different tasks.
There are several ways to address this problem, including:
- Using a technique called "rehearsal" (also known as experience replay), where examples from earlier batches are periodically reintroduced to the model during training (see the sketch after this list).
- Using a technique called "elastic weight consolidation" (EWC), which aims to preserve the model's performance on earlier data by penalizing changes to the parameters that were important for it.
- Using a technique called "synaptic intelligence" (SI), which, similarly to EWC, estimates how important each parameter is (computed online during training) and slows down changes to the important ones.
If the issue is mainly overfitting rather than a shifting data distribution, standard techniques such as regularization (e.g. weight decay), early stopping, and dropout also help (see the sketch below).
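A minimal sketch of combining those three in PyTorch; the architecture, hyperparameters, and the `train_loader`/`val_loader` names are assumptions for illustration:

```python
import copy
import torch
import torch.nn as nn

def train_with_regularization(train_loader, val_loader,
                              n_features, n_classes,
                              epochs=100, patience=5):
    # Dropout randomly zeroes activations during training.
    model = nn.Sequential(
        nn.Linear(n_features, 128), nn.ReLU(), nn.Dropout(p=0.5),
        nn.Linear(128, n_classes),
    )
    # weight_decay adds an L2 penalty on the weights.
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-4)
    criterion = nn.CrossEntropyLoss()

    best_val, best_state, bad_epochs = float("inf"), None, 0
    for epoch in range(epochs):
        model.train()
        for x, y in train_loader:
            optimizer.zero_grad()
            criterion(model(x), y).backward()
            optimizer.step()

        # Early stopping: track validation loss and stop once it has not
        # improved for `patience` consecutive epochs.
        model.eval()
        with torch.no_grad():
            val_loss = sum(criterion(model(x), y).item() for x, y in val_loader)
        if val_loss < best_val:
            best_val = val_loss
            best_state = copy.deepcopy(model.state_dict())
            bad_epochs = 0
        else:
            bad_epochs += 1
            if bad_epochs >= patience:
                break

    model.load_state_dict(best_state)  # restore the best validation checkpoint
    return model
```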