How to adjust pronunciation pitch in the Google Text-to-Speech API

I am using Google text-to-speech through the gTTS library. It works well, but I would like to adjust the pitch.

from gtts import gTTS

tts = gTTS("ご返信ありがとうございます。", lang="ja")

How should I go about this? Thanks in advance!

CodePudding user response:

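According to the official documentation, the Cloud Text-to-Speech API has an AudioConfig class that accepts a pitch parameter. The pitch is given in semitones and can be set in the range [-20.0, 20.0]. Note that this uses the google-cloud-texttospeech client library rather than gTTS, which as far as I can tell does not expose a pitch setting. Here is a working example.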

from google.cloud import texttospeech

# Instantiates a client
client = texttospeech.TextToSpeechClient()

# Set the text input to be synthesized
synthesis_input = texttospeech.SynthesisInput(text="Hello, World!")

# Build the voice request, select the language code ("en-US") and the ssml
# voice gender ("neutral")
voice = texttospeech.VoiceSelectionParams(
    language_code="en-US", ssml_gender=texttospeech.SsmlVoiceGender.NEUTRAL
)

# Select the type of audio file you want returned and set the pitch
# (in semitones; negative values lower the voice, positive values raise it)
audio_config = texttospeech.AudioConfig(
    pitch=-1.20,
    audio_encoding=texttospeech.AudioEncoding.MP3
)

# Perform the text-to-speech request on the text input with the selected
# voice parameters and audio file type
response = client.synthesize_speech(
    input=synthesis_input, voice=voice, audio_config=audio_config
)

# The response's audio_content is binary.
with open("output.mp3", "wb") as out:
    # Write the response to the output file.
    out.write(response.audio_content)
    print('Audio content written to file "output.mp3"')
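
If you want to change the pitch for only part of the text, another option is SSML: the Cloud Text-to-Speech API accepts a prosody element with a pitch attribute. The following is only a minimal sketch of building such a string. The helper names clamp_pitch and build_prosody_ssml are hypothetical (they are not part of any Google library), and reusing the [-20.0, 20.0] range from AudioConfig for the SSML value is an assumption.

```python
from xml.sax.saxutils import escape

# Range quoted above for AudioConfig.pitch; reusing it for SSML is an assumption.
MIN_PITCH = -20.0
MAX_PITCH = 20.0


def clamp_pitch(semitones: float) -> float:
    """Limit a pitch offset (hypothetical helper) to [MIN_PITCH, MAX_PITCH]."""
    return max(MIN_PITCH, min(MAX_PITCH, float(semitones)))


def build_prosody_ssml(text: str, semitones: float) -> str:
    """Wrap text in an SSML <prosody> element with a relative pitch in semitones."""
    pitch = clamp_pitch(semitones)
    # escape() replaces &, < and > so the text cannot break the markup.
    return f'<speak><prosody pitch="{pitch:+.1f}st">{escape(text)}</prosody></speak>'


if __name__ == "__main__":
    # Uses the sentence from the question, lowered by 1.2 semitones.
    print(build_prosody_ssml("ご返信ありがとうございます。", -1.2))
```

The resulting string should be usable in place of the plain text by passing it as texttospeech.SynthesisInput(ssml=...) instead of SynthesisInput(text=...). For the Japanese sentence from the question you would probably also set language_code="ja-JP" in VoiceSelectionParams.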