Class RecognitionMetadata (2.25.1)

RecognitionMetadata(mapping=None, *, ignore_unknown_fields=False, **kwargs)

Description of audio data to be recognized.

Attributes

Name | Description
interaction_type google.cloud.speech_v1.types.RecognitionMetadata.InteractionType
The use case most closely describing the audio content to be recognized.
industry_naics_code_of_audio int
The industry vertical to which this speech recognition request most closely applies. This is most indicative of the topics contained in the audio. Use the 6-digit NAICS code to identify the industry vertical - see https://www.naics.com/search/.
microphone_distance google.cloud.speech_v1.types.RecognitionMetadata.MicrophoneDistance
The audio type that most closely describes the audio being recognized.
original_media_type google.cloud.speech_v1.types.RecognitionMetadata.OriginalMediaType
The original media the speech was recorded on.
recording_device_type google.cloud.speech_v1.types.RecognitionMetadata.RecordingDeviceType
The type of device the speech was recorded with.
recording_device_name str
The device used to make the recording. Examples: 'Nexus 5X', 'Polycom SoundStation IP 6000', 'POTS', 'VoIP', or 'Cardioid Microphone'.
original_mime_type str
MIME type of the original audio file. For example: audio/m4a, audio/x-alaw-basic, audio/mp3, audio/3gpp. A list of possible audio MIME types is maintained at http://www.iana.org/assignments/media-types/media-types.xhtml#audio
audio_topic str
Description of the content. For example, "Recordings of federal supreme court hearings from 2012".
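
To make the attribute list above concrete, the following sketch constructs a RecognitionMetadata message with the proto-plus keyword constructor shown at the top of this page. The specific field values (the NAICS code, device name, MIME type, and topic) are illustrative placeholders, not recommendations.

    from google.cloud import speech_v1

    # Illustrative values only; choose the fields that describe your own audio.
    metadata = speech_v1.RecognitionMetadata(
        interaction_type=speech_v1.RecognitionMetadata.InteractionType.PHONE_CALL,
        industry_naics_code_of_audio=518210,  # 6-digit NAICS code (placeholder)
        microphone_distance=speech_v1.RecognitionMetadata.MicrophoneDistance.NEARFIELD,
        original_media_type=speech_v1.RecognitionMetadata.OriginalMediaType.AUDIO,
        recording_device_type=speech_v1.RecognitionMetadata.RecordingDeviceType.SMARTPHONE,
        recording_device_name="Nexus 5X",
        original_mime_type="audio/mp3",
        audio_topic="Recordings of federal supreme court hearings from 2012",
    )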

Classes

InteractionType

InteractionType(value)

Use case categories that the audio recognition request can be described by.

Values:
INTERACTION_TYPE_UNSPECIFIED (0): Use case is either unknown or is something other than one of the other values below.
DISCUSSION (1): Multiple people in a conversation or discussion. For example, a meeting with two or more people actively participating. Typically all the primary people speaking would be in the same room (if not, see PHONE_CALL).
PRESENTATION (2): One or more persons lecturing or presenting to others, mostly uninterrupted.
PHONE_CALL (3): A phone call or video conference in which two or more people, who are not in the same room, are actively participating.
VOICEMAIL (4): A recorded message intended for another person to listen to.
PROFESSIONALLY_PRODUCED (5): Professionally produced audio (e.g. a TV show or podcast).
VOICE_SEARCH (6): Transcribe spoken questions and queries into text.
VOICE_COMMAND (7): Transcribe voice commands, such as for controlling a device.
DICTATION (8): Transcribe speech to text to create a written document, such as a text message, email, or report.

MicrophoneDistance

MicrophoneDistance(value)

Enumerates the types of capture settings describing an audio file.

Values:
MICROPHONE_DISTANCE_UNSPECIFIED (0): Audio type is not known.
NEARFIELD (1): The audio was captured from a closely placed microphone, e.g. a phone, dictaphone, or handheld microphone. Generally, the speaker is within 1 meter of the microphone.
MIDFIELD (2): The speaker is within 3 meters of the microphone.
FARFIELD (3): The speaker is more than 3 meters away from the microphone.

OriginalMediaType

OriginalMediaType(value)

The original media the speech was recorded on.

Values:
ORIGINAL_MEDIA_TYPE_UNSPECIFIED (0): Unknown original media type.
AUDIO (1): The speech data is an audio recording.
VIDEO (2): The speech data was originally recorded on a video.

RecordingDeviceType

RecordingDeviceType(value)

The type of device the speech was recorded with.

Values:
RECORDING_DEVICE_TYPE_UNSPECIFIED (0): The recording device is unknown.
SMARTPHONE (1): Speech was recorded on a smartphone.
PC (2): Speech was recorded using a personal computer or tablet.
PHONE_LINE (3): Speech was recorded over a phone line.
VEHICLE (4): Speech was recorded in a vehicle.
OTHER_OUTDOOR_DEVICE (5): Speech was recorded outdoors.
OTHER_INDOOR_DEVICE (6): Speech was recorded indoors.
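
In practice these enums are consumed by attaching the metadata message to a recognition request. Below is a minimal sketch, assuming RecognitionConfig still accepts its metadata field (it is marked deprecated in some releases) and using a placeholder Cloud Storage URI.

    from google.cloud import speech_v1

    client = speech_v1.SpeechClient()

    config = speech_v1.RecognitionConfig(
        encoding=speech_v1.RecognitionConfig.AudioEncoding.LINEAR16,
        sample_rate_hertz=16000,
        language_code="en-US",
        # Describe the audio so the service has more context about it.
        metadata=speech_v1.RecognitionMetadata(
            interaction_type=speech_v1.RecognitionMetadata.InteractionType.DICTATION,
            microphone_distance=speech_v1.RecognitionMetadata.MicrophoneDistance.NEARFIELD,
            recording_device_type=speech_v1.RecognitionMetadata.RecordingDeviceType.PC,
        ),
    )
    audio = speech_v1.RecognitionAudio(uri="gs://your-bucket/your-audio.wav")  # placeholder URI

    response = client.recognize(config=config, audio=audio)
    for result in response.results:
        print(result.alternatives[0].transcript)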