import glob

import librosa
import librosa.display
import matplotlib.pyplot as plt

path = r'/content/drive/MyDrive/ESC-50/305 - Coughing/*.ogg'
files = glob.glob(path)
print(len(files))

for file in files:
    # Load the clip; librosa.load resamples to 22050 Hz by default
    scale, sr = librosa.load(file)
    mel_spectrogram = librosa.feature.melspectrogram(
        y=scale, sr=sr, n_fft=1024, hop_length=512, n_mels=228
    )
    # Convert the power spectrogram to decibels
    log_mel_spectrogram = librosa.power_to_db(mel_spectrogram)
    plt.figure(figsize=(10, 5))
    librosa.display.specshow(log_mel_spectrogram, x_axis="time",
                             y_axis="mel", sr=sr)
    plt.colorbar(format="%+2.0f dB")
    plt.show()
I am trying to read audio files and convert them into mel spectrograms to train a machine learning model. However, the spectrograms look different from one another even though the audio clips have the same length and the same sampling frequency. I want every spectrogram to have the same background so that I can get better accuracy from my model.
https://i.stack.imgur.com/beDR8.png
CodePudding user response:
The values of your spectrograms look reasonable and are generally in the same range for all the audio clips. But you have not specified the color map when plotting, so some plots get a different color map because librosa autodetects one from the data. Pass cmap='magma' to librosa.display.specshow and that should not be a problem.
Note that for machine learning, you should not use the plot of the spectrogram, but the spectrogram values directly. If you want an image representation of that, see https://stackoverflow.com/a/57204349/1967571
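
One possible way to turn the spectrogram values themselves into a fixed-scale array is sketched below. The function name log_mel_to_image and the [-80, 0] dB range are assumptions (the range matches values computed with librosa.power_to_db(..., ref=np.max) and the default top_db); this is not necessarily what the linked answer does.

```python
import numpy as np


def log_mel_to_image(log_mel, min_db=-80.0, max_db=0.0):
    """Map dB values to uint8 using a fixed range, so every clip shares one scale."""
    clipped = np.clip(log_mel, min_db, max_db)
    scaled = (clipped - min_db) / (max_db - min_db)  # now in [0, 1]
    img = np.round(scaled * 255).astype(np.uint8)
    # Flip vertically so that low frequencies end up at the bottom of the image
    return np.flipud(img)


# Tiny hand-made example: rows are mel bands, columns are frames
example = np.array([[-80.0, -60.0, 0.0],
                    [-100.0, -20.0, 10.0]])
image = log_mel_to_image(example)
print(image)
```

Because the range is fixed rather than taken from each clip's own minimum and maximum, the same dB level always maps to the same pixel value across the dataset.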