Represents the natural language speech audio to be processed.
| JSON representation |
|---|
| `{ "config": { object }, "audio": string }` |

| Fields | |
|---|---|
| `config` | Required. Instructs the speech recognizer how to process the speech audio. |
| `audio` | Required. The natural language speech audio to be processed. A single request can contain up to 2 minutes of speech audio data. The transcribed text cannot contain more than 256 bytes for virtual agent interactions. A base64-encoded string. |
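
The two required fields can be assembled as in the sketch below. It is a minimal illustration, not the official client: the helper name `build_audio_input` and the key inside `config` are hypothetical placeholders, since this page does not list the recognizer settings. The only behavior taken from the field descriptions above is that both fields are required and that `audio` travels as a base64-encoded string.

```python
import base64
import json


def build_audio_input(audio_bytes: bytes, config: dict) -> dict:
    """Return a JSON-serializable dict holding the two required fields.

    `build_audio_input` is a hypothetical helper for illustration only.
    """
    if not audio_bytes:
        raise ValueError("audio is required")
    if not config:
        raise ValueError("config is required")
    return {
        "config": config,
        # JSON cannot carry raw bytes, so the audio is sent as base64 text.
        "audio": base64.b64encode(audio_bytes).decode("ascii"),
    }


# Placeholder values: the key inside `config` is an assumption, not a documented field.
example = build_audio_input(b"\x00\x01\x02\x03", {"hypotheticalSetting": "value"})
print(json.dumps(example, indent=2))
```

Encoding happens on the client before the request is sent. The 2-minute audio limit and the 256-byte transcript limit are not checked here, because both depend on the audio encoding and the recognition result rather than on the payload shape.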