Instructs the speech synthesizer on how to generate the output audio content. If this audio config is supplied in a request, it overrides all existing text-to-speech settings applied to the agent.
audioEncoding
enum (OutputAudioEncoding)
Required. Audio encoding of the synthesized audio content.
sampleRateHertz
integer
The synthesis sample rate (in hertz) for this audio. If not provided, the synthesizer uses the default sample rate for the audio encoding. If the requested rate differs from the voice's natural sample rate, the synthesizer honors the request by converting the audio to the requested sample rate, which might reduce audio quality.
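As an illustration of the fields above, here is a minimal Python sketch that builds this config as a JSON-serializable dictionary. The helper name `build_output_audio_config` is hypothetical and not part of any client library, and the encoding value `OUTPUT_AUDIO_ENCODING_LINEAR_16` and the rate of 16000 Hz are assumed example values. The sketch omits `sampleRateHertz` when no rate is given, so the synthesizer would fall back to the default for the encoding.

```python
import json


def build_output_audio_config(audio_encoding, sample_rate_hertz=None):
    """Build an output audio config as a plain dict (hypothetical helper).

    audio_encoding is required; sample_rate_hertz is optional and is left
    out of the result when not provided.
    """
    if not audio_encoding:
        raise ValueError("audioEncoding is required")

    config = {"audioEncoding": audio_encoding}

    if sample_rate_hertz is not None:
        # sampleRateHertz is documented as an integer; reject anything else.
        if not isinstance(sample_rate_hertz, int) or sample_rate_hertz <= 0:
            raise ValueError("sampleRateHertz must be a positive integer")
        config["sampleRateHertz"] = sample_rate_hertz

    return config


# Example values are assumptions chosen for illustration only.
config = build_output_audio_config("OUTPUT_AUDIO_ENCODING_LINEAR_16", 16000)
print(json.dumps(config, indent=2))
```

Requesting a rate other than the voice's natural one is allowed, but as noted above the conversion might reduce audio quality, so leaving the rate unset is a reasonable default when no specific rate is needed.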
Last updated 2025-03-05 UTC.