For a project of mine, I need to check how loud my surroundings are. I combined two pieces of code to create a pretty accurate decibel meter:
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
import pyaudio
import numpy as np
CHUNK = 1024
FORMAT = pyaudio.paInt16
CHANNELS = 1
RATE = 44100
pa = pyaudio.PyAudio()
stream = pa.open(format=FORMAT,
                 channels=CHANNELS,
                 rate=RATE,
                 input=True,
                 frames_per_buffer=CHUNK)
buffer = []
while True:
    string_audio_data = stream.read(CHUNK)
    audio_data = np.fromstring(string_audio_data, dtype=np.short)
    loudness = 20*np.log10(np.sqrt(np.mean(np.absolute(audio_data)**2)))
    print(loudness)
Thanks to K K on SO for the math and thanks to YuanGao on cmsdk.com for the recording of sound.
This returns the loudness in decibels, and the readings agree well with other decibel meters that I consider reliable.
However, as soon as the audio starts to get a bit louder, the script often returns either nan or numbers that make no sense.
Is there an inherent mistake I'm overlooking or is that just a limitation of this method?
CodePudding user response:
Your data is of type np.int16. You're taking the absolute value and then squaring it, but NumPy performs all of that arithmetic in 16 bits, so any result outside the range -32768 to 32767 overflows.
As an example of the failure, try the following in your interpreter:
>>> x = np.arange(30000, 30100, dtype=np.short)
>>> print(x * x)
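and you'll see the problem. Approximately half the results are negative numbers because the products wrap around when they are truncated to 16 bits. If enough squared samples wrap like this, the mean can come out negative, in which case np.sqrt returns NaN and the logarithm passes that NaN along. Even when the mean stays positive, it is built from wrapped values, which is where the nonsensical numbers come from.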
This also explains why it only happens when the music is loud. A sample with a magnitude of 181 or less squares to at most 32761, which still fits in 16 bits, so soft music won't overflow.
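To see the whole chain from overflow to NaN in one place, here is a minimal sketch. The constant-valued chunk is a made-up stand-in for one loud read from the stream, not real microphone data, and the name chunk is just for illustration.

```python
import numpy as np

# Hypothetical stand-in for one loud chunk: 1024 samples, all equal to 30000.
chunk = np.full(1024, 30000, dtype=np.int16)

squared = chunk * chunk         # int16 * int16 stays int16 and wraps around
mean_square = np.mean(squared)  # mean of the already-wrapped values

# Silence the "invalid value" RuntimeWarning that sqrt of a negative number raises.
with np.errstate(invalid="ignore"):
    loudness = 20 * np.log10(np.sqrt(mean_square))

print(squared.dtype, squared[0], mean_square, loudness)
```

Every product wraps to the same negative value here, so the mean is negative and the final result is nan. With real audio the wrapped values are mixed, which is why the output is sometimes nan and sometimes just wrong.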
So change
audio_data = np.fromstring(string_audio_data, dtype=np.short)
to
audio_data = np.fromstring(string_audio_data, dtype=np.short).astype(float)
and your code will work as intended.
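If you want to check the fix without a microphone, something like the following should work. This is only a sketch: loudness_db is a hypothetical helper name, the test chunk is synthetic, and I've swapped in np.frombuffer because the binary mode of np.fromstring is deprecated in newer NumPy releases. The silence guard is my own addition, not part of the original code.

```python
import numpy as np

def loudness_db(raw_bytes):
    """Return 20*log10(RMS) for a buffer of 16-bit signed samples."""
    # Convert to float before squaring so the arithmetic cannot overflow.
    samples = np.frombuffer(raw_bytes, dtype=np.int16).astype(np.float64)
    rms = np.sqrt(np.mean(samples ** 2))
    # Guard against log10(0) for an all-zero (silent) buffer.
    return 20 * np.log10(rms) if rms > 0 else float("-inf")

# Synthetic loud chunk standing in for stream.read(CHUNK).
loud_chunk = np.full(1024, 30000, dtype=np.int16).tobytes()
print(loudness_db(loud_chunk))
```

In the recording loop you would call loudness_db(stream.read(CHUNK)) in place of the two lines that build audio_data and loudness.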