Audio recognition and fingerprinting using sklearn & librosa

I want to create a model that can predict who is speaking, even when the speakers say different words.

For this I am trying to use the following features:

MFCC
Mel spectrogram
Tempo
Chroma STFT
Spectral centroid
Spectral bandwidth

For training, I am using RandomForestRegressor.

Is it possible to create a model like that?

CodePudding user response:

For the sound processing and feature extraction part, librosa will definitely provide everything you need.

For the machine learning part, however, speaker identification (also called "voice recognition") is a relatively complex task. You will probably have more success using deep learning techniques. You can certainly try random forests if you like, but you will probably get lower accuracy and will have to spend more time on feature engineering. In fact, comparing the results you get with the various techniques would be a good exercise.
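If you do want to try a random forest as a baseline, a minimal sketch could look like the following. Note that identifying a speaker is a classification problem, so this example uses RandomForestClassifier in place of the RandomForestRegressor mentioned in the question. The data here is randomly generated and purely illustrative: two made-up speakers, 75 features per recording, and the names speaker_a and speaker_b are all assumptions. Real feature vectors would replace X and labels.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Placeholder data: one row per recording, one column per feature.
# Each hypothetical speaker is drawn from a different Gaussian cluster.
rng = np.random.default_rng(0)
n_per_speaker, n_features = 100, 75
X = np.vstack([
    rng.normal(0.0, 1.0, size=(n_per_speaker, n_features)),
    rng.normal(2.0, 1.0, size=(n_per_speaker, n_features)),
])
labels = np.array(["speaker_a"] * n_per_speaker + ["speaker_b"] * n_per_speaker)

# Hold out 25% of the recordings, keeping the speaker balance the same.
X_train, X_test, y_train, y_test = train_test_split(
    X, labels, test_size=0.25, stratify=labels, random_state=0
)

clf = RandomForestClassifier(n_estimators=200, random_state=0)
clf.fit(X_train, y_train)

accuracy = accuracy_score(y_test, clf.predict(X_test))
print(f"held-out accuracy: {accuracy:.2f}")
```

The accuracy on this toy data says nothing about real speech, where speakers overlap far more. The same held-out split can be reused to compare the random forest against a deep learning model.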

For an example tutorial on speaker identification using Keras, see e.g. this article.