WER for wav2vec2-base model remains as 1 throughout the whole training process

Time: 01-30

I am trying to run the wav2vec2 speech recognition model as shared in https://huggingface.co/docs/transformers/tasks/asr

This is the loss and WER during the training process: the validation loss decreases significantly, but the WER stays at 1.

I printed out the predicted and label values; this is what I got for the last 3 outputs, which results in WER = 1.

This is the set of parameters of the model.

What might actually be going wrong here? Any help is appreciated, thanks!

I have tried tuning the hyperparameters in the hope of reducing the WER.
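For context on how the metric behaves: WER is the word-level edit distance between prediction and reference, divided by the number of reference words. The sketch below is a minimal stand-alone implementation (not the one used by the `evaluate` library) to show that a prediction that matches on no words at all, even just because of a case mismatch, scores exactly 1.

```python
# Minimal word-level WER: Levenshtein distance over word sequences,
# normalized by the reference length. Illustration only.
def wer(reference: str, hypothesis: str) -> float:
    ref, hyp = reference.split(), hypothesis.split()
    # DP table for edit distance between the two word sequences
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # substitution
    return d[len(ref)][len(hyp)] / len(ref)

print(wer("hello world", "hello world"))   # 0.0
print(wer("HELLO WORLD", "hello world"))   # 1.0 -- every word counts as wrong
```

Because the comparison is exact string matching per word, labels and predictions that differ only in casing produce a WER of 1.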

CodePudding user response:

Thank you for providing some useful information for troubleshooting.

  • Your loss is decreasing, which shows that the model is training; however, your learning rate of 0.01 is very high. Consider changing it to something like 1e-5, as shown in the example on Hugging Face.

  • The other thing I noticed is that all your input text is in UPPER CASE LIKE THIS. Depending on the data the original model was trained on, it may not expect upper-case text. Try lower-casing your text to see whether that yields a lower WER.

  • Your save_steps and eval_steps are also both far too low. These parameters control how often the model is checkpointed and evaluated; with a value of 1 for both, you save and evaluate after every single training step, so each evaluation reflects almost no additional training and the run is slowed down enormously. Increase these parameters (the Hugging Face example uses a much larger interval) and try again.
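The suggestions above could look something like this in code. The function and variable names here are hypothetical placeholders (your script's column names and values may differ); the hyperparameter values are the ones suggested above, in the spirit of the Hugging Face ASR example.

```python
# Sketch of the suggested fixes (hypothetical names; adapt to your script).

# 1) Lower-case transcripts before tokenization, since the tokenizer's
#    vocabulary may not contain upper-case characters.
def normalize_transcript(batch):
    batch["text"] = batch["text"].lower()
    return batch

# 2) Hyperparameters closer to the Hugging Face ASR example
#    (to be passed into TrainingArguments):
suggested_args = {
    "learning_rate": 1e-5,   # instead of 0.01
    "eval_steps": 500,       # evaluate every 500 steps, not every step
    "save_steps": 500,       # checkpoint at the same cadence
}

batch = normalize_transcript({"text": "MISTER QUILTER IS THE APOSTLE"})
print(batch["text"])  # mister quilter is the apostle
```

With a `datasets.Dataset`, the normalization step would typically be applied with `dataset.map(normalize_transcript)` before feature extraction.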
