Listening for 2 or more microphones using Microsoft speech services

Time:03-31

Good day

I have a Python project where you can talk to an assistant and get responses, like a chat. The app works well; now I want to connect two microphones and talk to my assistant through either of them.

The problem is that I'm using Microsoft Speech Services, and their examples don't show how to use two audio streams or anything similar. I saw their topic on multiple audio recognition, but it covers only Java, C# and C++; Python is not supported.

My question: is there any way to connect two or more microphones to my laptop and use two audio streams at the same time to get responses from my app?

I have Python 3.9 installed, and my code just uses the recognize_once() function from Microsoft's examples.

I was wondering whether I could run multiple threads and listen for audio on each of them, but I have no idea how. I did search for related topics, but people explain how to do this with PyAudio; I use Microsoft Speech Services because my language isn't supported elsewhere.

Any help would be appreciated. Sorry for my English.

CodePudding user response:

For this kind of problem, you can use a multi-channel microphone array. Microsoft's Speech SDK documentation has a page called "Microphone array recommendations". It describes several array layouts, and the number of microphones you can include depends on the channel count: arrays of 2, 4 or 7 channels are covered.

2 microphones: a linear array.

Check the following document to learn about the microphone spacing and array geometry.

Document

Make sure the Microsoft Azure Kinect DK microphone array is connected and enabled as an input device. Then use the Python code below, which is working code.

import pyaudio
import wave
import numpy as np

p = pyaudio.PyAudio()

# Find the device index of the Azure Kinect microphone array
azure_kinect_device_name = "Azure Kinect Microphone Array"
index = -1
for i in range(p.get_device_count()):
    print(p.get_device_info_by_index(i))
    if azure_kinect_device_name in p.get_device_info_by_index(i)["name"]:
        index = i
        break
if index == -1:
    p.terminate()
    # raise SystemExit works without the interactive-only exit() helper
    raise SystemExit("Could not find Azure Kinect Microphone Array. Make sure it is properly connected.")

# Open the stream for reading audio
input_format = pyaudio.paInt32
input_sample_width = 4  # bytes per sample for paInt32
input_channels = 7  # channel count of your array: 2, 4 or 7
input_sample_rate = 48000

stream = p.open(format=input_format, channels=input_channels, rate=input_sample_rate, input=True, input_device_index=index)

# Read frames from microphone and write to wav file
with wave.open("output.wav", "wb") as outfile:
    outfile.setnchannels(1) # We want to write only first channel from each frame
    outfile.setsampwidth(input_sample_width)
    outfile.setframerate(input_sample_rate)

    time_to_read_in_seconds = 5
    frames_to_read = time_to_read_in_seconds * input_sample_rate
    total_frames_read = 0
    while total_frames_read < frames_to_read:
        available_frames = stream.get_read_available()
        read_frames = stream.read(available_frames)
        # Samples are interleaved by channel, so every input_channels-th sample belongs to channel 0
        first_channel_data = np.frombuffer(read_frames, dtype=np.int32)[0::input_channels].tobytes()
        outfile.writeframesraw(first_channel_data)
        total_frames_read += available_frames

stream.stop_stream()
stream.close()

p.terminate()
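
If you have two separate microphones rather than one array, the multi-thread idea from the question can be sketched as below: one thread per device, each running a single recognize_once() call. This is only a minimal sketch, not Microsoft's documented approach for multiple streams. The helper names recognize_once_from_device and listen_on_all are made up for illustration, and it assumes the azure-cognitiveservices-speech package is installed and that you already know the platform-specific device ID of each microphone.

```python
import queue
import threading


def recognize_once_from_device(device_name, speech_key, region, language="en-US"):
    """Run one recognize_once() call against a specific microphone.

    Hypothetical helper. The Speech SDK is imported here, not at the top,
    so the threading helper below can be tried without the package.
    """
    import azure.cognitiveservices.speech as speechsdk

    speech_config = speechsdk.SpeechConfig(subscription=speech_key, region=region)
    speech_config.speech_recognition_language = language
    # device_name is the platform-specific ID of the input device
    audio_config = speechsdk.audio.AudioConfig(device_name=device_name)
    recognizer = speechsdk.SpeechRecognizer(
        speech_config=speech_config, audio_config=audio_config
    )
    result = recognizer.recognize_once()
    return result.text  # empty string when nothing was recognized


def listen_on_all(device_names, recognize_fn):
    """Call recognize_fn(device_name) on one thread per device.

    Returns a dict mapping each device name to its result, or to the
    exception that was raised while listening on that device.
    """
    results = queue.Queue()

    def worker(name):
        try:
            results.put((name, recognize_fn(name)))
        except Exception as exc:  # keep one failing microphone from hiding the other
            results.put((name, exc))

    threads = [
        threading.Thread(target=worker, args=(name,), daemon=True)
        for name in device_names
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()

    collected = {}
    while not results.empty():
        name, value = results.get()
        collected[name] = value
    return collected
```

To try it, bind your key and region with functools.partial(recognize_once_from_device, speech_key=..., region=..., language=...) and pass the result to listen_on_all together with the list of your two device IDs. Each call blocks until both microphones have finished their single utterance, so you would wrap it in a loop for a continuous chat.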