API documentation for the speech_v1p1beta1.types package.
Classes
CreateCustomClassRequest
Message sent by the client for the CreateCustomClass method.
.. attribute:: parent
Required. The parent resource where this custom class will be created. Format: {api_version}/projects/{project}/locations/{location}/customClasses
:type: str
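As a minimal sketch of the parent format above, the helper below assembles the string from its parts. The function name custom_class_parent and the example values ("v1p1beta1", "my-project", "global") are hypothetical and not part of the library; only the format string comes from the description above.

```python
def custom_class_parent(api_version: str, project: str, location: str) -> str:
    """Build a parent path in the documented format (hypothetical helper)."""
    parts = (("api_version", api_version), ("project", project), ("location", location))
    for label, value in parts:
        # Each piece must be a single, non-empty path segment.
        if not value or "/" in value:
            raise ValueError(f"{label} must be a non-empty path segment, got {value!r}")
    return f"{api_version}/projects/{project}/locations/{location}/customClasses"


# Example values are assumptions chosen for illustration only.
parent = custom_class_parent("v1p1beta1", "my-project", "global")
print(parent)
```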
CreatePhraseSetRequest
Message sent by the client for the CreatePhraseSet method.
.. attribute:: parent
Required. The parent resource where this phrase set will be created. Format: {api_version}/projects/{project}/locations/{location}/phraseSets
:type: str
CustomClass
A set of words or phrases that represents a common concept likely to appear in your audio, for example a list of passenger ship names. CustomClass items can be substituted into placeholders that you set in PhraseSet phrases.
DeleteCustomClassRequest
Message sent by the client for the DeleteCustomClass method.
.. attribute:: name
Required. The name of the custom class to delete. Format: {api_version}/projects/{project}/locations/{location}/customClasses/{custom_class}
:type: str
DeletePhraseSetRequest
Message sent by the client for the DeletePhraseSet method.
.. attribute:: name
Required. The name of the phrase set to delete. Format: {api_version}/projects/{project}/locations/{location}/phraseSets/{phrase_set}
:type: str
GetCustomClassRequest
Message sent by the client for the GetCustomClass method.
.. attribute:: name
Required. The name of the custom class to retrieve. Format: {api_version}/projects/{project}/locations/{location}/customClasses/{custom_class}
:type: str
GetPhraseSetRequest
Message sent by the client for the GetPhraseSet method.
.. attribute:: name
Required. The name of the phrase set to retrieve. Format: {api_version}/projects/{project}/locations/{location}/phraseSets/{phrase_set}
:type: str
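The name format above can also be read in the other direction. The following is a sketch, not part of the library: parse_phrase_set_name is a hypothetical helper that splits a name into the four placeholders of the documented format using a regular expression.

```python
import re

# Pattern derived from the documented format; each placeholder is one path segment.
_PHRASE_SET_NAME = re.compile(
    r"^(?P<api_version>[^/]+)/projects/(?P<project>[^/]+)"
    r"/locations/(?P<location>[^/]+)/phraseSets/(?P<phrase_set>[^/]+)$"
)


def parse_phrase_set_name(name: str) -> dict:
    """Split a phrase set name into its components (hypothetical helper)."""
    match = _PHRASE_SET_NAME.match(name)
    if match is None:
        raise ValueError(f"not a phrase set name: {name!r}")
    return match.groupdict()


# The example name is an assumption chosen for illustration only.
fields = parse_phrase_set_name("v1p1beta1/projects/my-project/locations/global/phraseSets/ships")
print(fields["phrase_set"])
```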
ListCustomClassesRequest
Message sent by the client for the ListCustomClasses method.
.. attribute:: parent
Required. The parent, which owns this collection of custom classes. Format: {api_version}/projects/{project}/locations/{location}/customClasses
:type: str
ListCustomClassesResponse
Message returned to the client by the ListCustomClasses method.
.. attribute:: custom_classes
The custom classes.
:type: Sequence[google.cloud.speech_v1p1beta1.types.CustomClass]
ListPhraseSetRequest
Message sent by the client for the ListPhraseSet method.
.. attribute:: parent
Required. The parent, which owns this collection of phrase sets. Format: projects/{project}/locations/{location}
:type: str
ListPhraseSetResponse
Message returned to the client by the ListPhraseSet method.
.. attribute:: phrase_sets
The phrase sets.
:type: Sequence[google.cloud.speech_v1p1beta1.types.PhraseSet]
LongRunningRecognizeMetadata
Describes the progress of a long-running LongRunningRecognize call. It is included in the metadata field of the Operation returned by the GetOperation call of the google::longrunning::Operations service.
LongRunningRecognizeRequest
The top-level message sent by the client for the LongRunningRecognize method.
LongRunningRecognizeResponse
The only message returned to the client by the LongRunningRecognize method. It contains the result as zero or more sequential SpeechRecognitionResult messages. It is included in the result.response field of the Operation returned by the GetOperation call of the google::longrunning::Operations service.
PhraseSet
Provides "hints" to the speech recognizer to favor specific words and phrases in the results.
RecognitionAudio
Contains audio data in the encoding specified in the RecognitionConfig. Either content or uri must be supplied. Supplying both or neither returns google.rpc.Code.INVALID_ARGUMENT.
See `content limits <https://cloud.google.com/speech-to-text/quotas#content>`__.
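The "exactly one of content or uri" rule can be mirrored on the client before a request is sent. This is a sketch under stated assumptions: check_audio_source is a hypothetical helper, it treats an empty value as "not supplied" (an assumption), and it raises a local ValueError rather than reproducing the service's actual error object.

```python
from typing import Optional


def check_audio_source(content: Optional[bytes] = None, uri: Optional[str] = None) -> str:
    """Return which source is set, or raise if both or neither are set."""
    has_content = bool(content)  # assumption: empty bytes count as "not supplied"
    has_uri = bool(uri)          # assumption: empty string counts as "not supplied"
    if has_content == has_uri:
        # Both set or both missing: the service would answer INVALID_ARGUMENT.
        raise ValueError("INVALID_ARGUMENT: supply exactly one of content or uri")
    return "content" if has_content else "uri"


# The URI below is a made-up example value.
print(check_audio_source(uri="gs://example-bucket/audio.flac"))
```

A pre-check like this only saves a round trip; the service remains the authority on what it accepts.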
RecognitionConfig
Provides information to the recognizer that specifies how to process the request.
RecognitionMetadata
Description of audio data to be recognized.
.. attribute:: interaction_type
The use case most closely describing the audio content to be recognized.
:type: google.cloud.speech_v1p1beta1.types.RecognitionMetadata.InteractionType
RecognizeRequest
The top-level message sent by the client for the Recognize method.
RecognizeResponse
The only message returned to the client by the Recognize method. It contains the result as zero or more sequential SpeechRecognitionResult messages.
SpeakerDiarizationConfig
Config to enable speaker diarization.
.. attribute:: enable_speaker_diarization
If 'true', enables speaker detection for each recognized word in the top alternative of the recognition result using a speaker_tag provided in the WordInfo.
:type: bool
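Once each word carries a speaker_tag, consecutive words with the same tag can be grouped into speaker turns. The sketch below uses plain dictionaries as stand-ins for WordInfo messages; the "word" key, the sample data, and the speaker_turns helper are assumptions made for illustration.

```python
from itertools import groupby

# Stand-in data: each dict mimics a word with its speaker_tag (values are made up).
words = [
    {"word": "hello", "speaker_tag": 1},
    {"word": "there", "speaker_tag": 1},
    {"word": "hi", "speaker_tag": 2},
    {"word": "again", "speaker_tag": 1},
]


def speaker_turns(words):
    """Group consecutive words that share a speaker_tag into (tag, text) turns."""
    turns = []
    for tag, group in groupby(words, key=lambda w: w["speaker_tag"]):
        turns.append((tag, " ".join(w["word"] for w in group)))
    return turns


for tag, text in speaker_turns(words):
    print(f"speaker {tag}: {text}")
```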
SpeechAdaptation
Speech adaptation configuration.
.. attribute:: phrase_sets
A collection of phrase sets. To specify the hints inline, leave the phrase set's name blank and fill in the rest of its fields. Any phrase set can use any custom class.
:type: Sequence[google.cloud.speech_v1p1beta1.types.PhraseSet]
SpeechContext
Provides "hints" to the speech recognizer to favor specific words and phrases in the results.
SpeechRecognitionAlternative
Alternative hypotheses (a.k.a. n-best list).
.. attribute:: transcript
Transcript text representing the words that the user spoke.
:type: str
SpeechRecognitionResult
A speech recognition result corresponding to a portion of the audio.
StreamingRecognitionConfig
Provides information to the recognizer that specifies how to process the request.
StreamingRecognitionResult
A streaming speech recognition result corresponding to a portion of the audio that is currently being processed.
StreamingRecognizeRequest
The top-level message sent by the client for the StreamingRecognize method. Multiple StreamingRecognizeRequest messages are sent. The first message must contain a streaming_config message and must not contain audio_content. All subsequent messages must contain audio_content and must not contain a streaming_config message.
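The ordering rule above (configuration first, audio afterwards) maps naturally onto a generator. This sketch uses plain dictionaries as stand-ins for StreamingRecognizeRequest messages; the request_stream helper and the placeholder configuration are hypothetical, and real code would build the library's message types instead.

```python
from typing import Dict, Iterable, Iterator


def request_stream(streaming_config: Dict, chunks: Iterable[bytes]) -> Iterator[Dict]:
    """Yield stand-in requests in the documented order."""
    # First message: configuration only, no audio.
    yield {"streaming_config": streaming_config}
    # Every later message: audio only, no configuration.
    for chunk in chunks:
        yield {"audio_content": chunk}


# Placeholder configuration and audio bytes, for illustration only.
requests = list(request_stream({"placeholder": "config"}, [b"abc", b"def"]))
print([sorted(r) for r in requests])
```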
StreamingRecognizeResponse
StreamingRecognizeResponse is the only message returned to the client by StreamingRecognize. A series of zero or more StreamingRecognizeResponse messages are streamed back to the client. If there is no recognizable audio, and single_utterance is set to false, then no messages are streamed back to the client.
Here's an example of a series of StreamingRecognizeResponse messages that might be returned while processing audio:
1. results { alternatives { transcript: "tube" } stability: 0.01 }
2. results { alternatives { transcript: "to be a" } stability: 0.01 }
3. results { alternatives { transcript: "to be" } stability: 0.9 } results { alternatives { transcript: " or not to be" } stability: 0.01 }
4. results { alternatives { transcript: "to be or not to be" confidence: 0.92 } alternatives { transcript: "to bee or not to bee" } is_final: true }
5. results { alternatives { transcript: " that's" } stability: 0.01 }
6. results { alternatives { transcript: " that is" } stability: 0.9 } results { alternatives { transcript: " the question" } stability: 0.01 }
7. results { alternatives { transcript: " that is the question" confidence: 0.98 } alternatives { transcript: " that was the question" } is_final: true }
Notes:
- Only two of the above responses, #4 and #7, contain final results; they are indicated by is_final: true. Concatenating these together generates the full transcript: "to be or not to be that is the question".
- The others contain interim results. #3 and #6 contain two interim results: the first portion has a high stability and is less likely to change; the second portion has a low stability and is very likely to change. A UI designer might choose to show only high stability results.
- The specific stability and confidence values shown above are only for illustrative purposes. Actual values may vary.
- In each response, only one of these fields will be set: error, speech_event_type, or one or more (repeated) results.
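The notes above can be sketched in code. The example rebuilds the seven responses as plain dictionaries (stand-ins, not the library's message types) and keeps only the first alternative of each result. The helper names and the 0.8 stability threshold for "high stability" are assumptions made for illustration.

```python
def result(transcript, **fields):
    """Build one stand-in result holding a single (top) alternative."""
    return {"alternatives": [{"transcript": transcript}], **fields}


# The seven example responses; second alternatives are omitted for brevity.
responses = [
    {"results": [result("tube", stability=0.01)]},
    {"results": [result("to be a", stability=0.01)]},
    {"results": [result("to be", stability=0.9), result(" or not to be", stability=0.01)]},
    {"results": [result("to be or not to be", is_final=True)]},
    {"results": [result(" that's", stability=0.01)]},
    {"results": [result(" that is", stability=0.9), result(" the question", stability=0.01)]},
    {"results": [result(" that is the question", is_final=True)]},
]


def final_transcript(responses):
    """Concatenate the top alternative of every final result."""
    parts = []
    for response in responses:
        for res in response["results"]:
            if res.get("is_final"):
                parts.append(res["alternatives"][0]["transcript"])
    return "".join(parts)


def stable_interim_text(response, threshold=0.8):
    """Return only the interim text whose stability meets the assumed threshold."""
    return "".join(
        res["alternatives"][0]["transcript"]
        for res in response["results"]
        if not res.get("is_final") and res.get("stability", 0.0) >= threshold
    )


print(final_transcript(responses))
print(repr(stable_interim_text(responses[2])))
```

A UI that shows only high stability text would display "to be" for response #3 and nothing for response #1, then replace the interim text once a final result arrives.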
TranscriptNormalization
Transcription normalization configuration. Use transcription normalization to automatically replace parts of the transcript with phrases of your choosing. For StreamingRecognize, this normalization only applies to stable partial transcripts (stability > 0.8) and final transcripts.
TranscriptOutputConfig
Specifies an optional destination for the recognition results.
UpdateCustomClassRequest
Message sent by the client for the UpdateCustomClass method.
.. attribute:: custom_class
Required. The custom class to update. The custom class's name field is used to identify the custom class to be updated. Format: {api_version}/projects/{project}/locations/{location}/customClasses/{custom_class}
UpdatePhraseSetRequest
Message sent by the client for the UpdatePhraseSet method.
.. attribute:: phrase_set
Required. The phrase set to update. The phrase set's name field is used to identify the set to be updated. Format: {api_version}/projects/{project}/locations/{location}/phraseSets/{phrase_set}
WordInfo
Word-specific information for recognized words.
.. attribute:: start_time
Time offset relative to the beginning of the audio, and corresponding to the start of the spoken word. This field is only set if enable_word_time_offsets=true and only in the top hypothesis. This is an experimental feature and the accuracy of the time offset can vary.
:type: google.protobuf.duration_pb2.Duration
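A protobuf Duration stores whole seconds and a nanosecond remainder in its seconds and nanos fields. The sketch below converts such an offset to fractional seconds. DurationLike is a hypothetical stand-in so the example runs without the protobuf package, and offset_seconds is likewise an illustrative helper, not a library function.

```python
from dataclasses import dataclass


@dataclass
class DurationLike:
    """Stand-in with the same two fields as a protobuf Duration."""
    seconds: int = 0
    nanos: int = 0


def offset_seconds(offset: DurationLike) -> float:
    """Convert a seconds/nanos pair to fractional seconds."""
    return offset.seconds + offset.nanos / 1_000_000_000


# Made-up offset: a word starting 1.5 seconds into the audio.
start_time = DurationLike(seconds=1, nanos=500_000_000)
print(offset_seconds(start_time))
```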