How to convert a mel spectrogram back to a WAV (audio) file?

Time:10-03

I am working on an audio ML problem. I am able to convert a given audio file from WAV to a mel spectrogram by following this TensorFlow document.

My use case goes one step further. Once I have a mel spectrogram, I want to reconstruct the audio file from it. Put simply, I need spectrogram-to-WAV conversion.

Could anyone please help me?

CodePudding user response:

I found a solution that works, as suggested by @ForamJ in the comments. However, it took me 30 minutes to convert 1 minute of audio.

import librosa

# step1 - load the wav file as a numpy array, along with its sample rate
my_audio_as_np_array, my_sample_rate = librosa.load("audio1.wav")

# step2 - convert the audio numpy array to a mel spectrogram
spec = librosa.feature.melspectrogram(y=my_audio_as_np_array,
                                      sr=my_sample_rate,
                                      n_fft=2048,
                                      hop_length=512,
                                      win_length=None,
                                      window='hann',
                                      center=True,
                                      pad_mode='reflect',
                                      power=2.0,
                                      n_mels=128)

# step3 - convert the mel spectrogram back to an audio signal
res = librosa.feature.inverse.mel_to_audio(spec, 
                                           sr=my_sample_rate, 
                                           n_fft=2048, 
                                           hop_length=512, 
                                           win_length=None, 
                                           window='hann', 
                                           center=True, 
                                           pad_mode='reflect', 
                                           power=2.0, 
                                           n_iter=32)

# step4 - save it as a wav file, using the sample rate returned in step1
import soundfile as sf
sf.write("test1.wav", res, my_sample_rate)