Send a recognition request with model adaptation
You can improve the accuracy of the transcription results you get from Speech-to-Text by using model adaptation. The model adaptation feature lets you specify words and/or phrases that Speech-to-Text must recognize more frequently in your audio data than other alternatives that might otherwise be suggested. Model adaptation is particularly useful for improving transcription accuracy in the following use cases:
1. Your audio contains words or phrases that are likely to occur frequently.
2. Your audio is likely to contain words that are rare (such as proper names) or words that do not exist in general use.
3. Your audio contains noise or is otherwise not very clear.
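For the second case, for example, a few inline phrase hints are often enough to steer recognition toward rare proper names. The following is an illustrative sketch only, not part of the official sample further below: the audio URI, sample rate, and phrase values are placeholder assumptions.

import os
from google.cloud import speech_v1p1beta1 as speech

# Illustrative only: bias recognition toward rare proper names that the
# default model is unlikely to produce. URI and phrases are placeholders.
client = speech.SpeechClient()
config = speech.RecognitionConfig(
    encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
    sample_rate_hertz=16000,
    language_code="en-US",
    # Inline phrase hints; `boost` raises the likelihood of these phrases.
    speech_contexts=[speech.SpeechContext(phrases=["Taneda", "Sushido"], boost=10.0)],
)
audio = speech.RecognitionAudio(uri="gs://[BUCKET]/[FILE]")
response = client.recognize(config=config, audio=audio)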
For more information about using this feature, see Improve transcription results with model adaptation. For information about phrase and character limits per model adaptation request, see Quotas and limits. Not all models support speech adaptation. See Language support to see which models support adaptation.
Code sample
Speech adaptation is an optional Speech-to-Text configuration that you can use to customize your transcription results according to your needs. See the RecognitionConfig documentation for more information about configuring the recognition request body.
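As a minimal illustration of that configuration, an adaptation can also be attached inline to RecognitionConfig without creating named resources first. This sketch is an assumption: the phrase value, encoding, and sample rate are placeholders rather than values from the official sample.

from google.cloud import speech_v1p1beta1 as speech

# Illustrative: attach an inline (unnamed) PhraseSet directly to the request.
adaptation = speech.SpeechAdaptation(
    phrase_sets=[
        speech.PhraseSet(
            phrases=[speech.PhraseSet.Phrase(value="fare", boost=10.0)]
        )
    ]
)
config = speech.RecognitionConfig(
    encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
    sample_rate_hertz=16000,
    language_code="en-US",
    adaptation=adaptation,  # the optional speech adaptation configuration
)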
The following code sample shows how to improve transcription accuracy using a SpeechAdaptation resource: PhraseSet, CustomClass, and model adaptation boost.
To use a PhraseSet or CustomClass in future requests, make a note of its resource name, returned in the response when you create the resource.
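A later request can then reference the saved name without recreating anything. Here is a hedged sketch, assuming the name was stored from an earlier create call; the bracketed values are placeholders:

from google.cloud import speech_v1p1beta1 as speech

# Assumption: this name was saved from an earlier create_phrase_set response.
phrase_set_name = "projects/[PROJECT_ID]/locations/global/phraseSets/[PHRASE_SET_ID]"

# Optionally verify the resource still exists before using it.
adaptation_client = speech.AdaptationClient()
phrase_set = adaptation_client.get_phrase_set(name=phrase_set_name)

# Reference the existing phrase set by name in a new recognition request.
config = speech.RecognitionConfig(
    encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
    sample_rate_hertz=16000,
    language_code="en-US",
    adaptation=speech.SpeechAdaptation(phrase_set_references=[phrase_set_name]),
)
audio = speech.RecognitionAudio(uri="gs://[BUCKET]/[FILE]")
response = speech.SpeechClient().recognize(config=config, audio=audio)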
For a list of the pre-built classes available for your language, see Supported class tokens.

Python

To learn how to install and use the client library for Speech-to-Text, see Speech-to-Text client libraries. For more information, see the Speech-to-Text Python API reference documentation.

To authenticate to Speech-to-Text, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

import os

from google.cloud import speech_v1p1beta1 as speech

PROJECT_ID = os.getenv("GOOGLE_CLOUD_PROJECT")


def transcribe_with_model_adaptation(
    audio_uri: str,
    custom_class_id: str,
    phrase_set_id: str,
) -> str:
    """Create `PhraseSet` and `CustomClass` resources for custom item lists in input data.
    Args:
        audio_uri (str): The Cloud Storage URI of the input audio, e.g. gs://[BUCKET]/[FILE]
        custom_class_id (str): The unique ID of the custom class to create.
        phrase_set_id (str): The unique ID of the phrase set to create.
    Returns:
        The transcript of the input audio.
    """
    # Specifies the location where the Speech API will be accessed.
    location = "global"

    # Audio object
    audio = speech.RecognitionAudio(uri=audio_uri)

    # Create the adaptation client
    adaptation_client = speech.AdaptationClient()

    # The parent resource where the custom class and phrase set will be created.
    parent = f"projects/{PROJECT_ID}/locations/{location}"

    # Create the custom class resource
    adaptation_client.create_custom_class(
        {
            "parent": parent,
            "custom_class_id": custom_class_id,
            "custom_class": {
                "items": [
                    {"value": "sushido"},
                    {"value": "altura"},
                    {"value": "taneda"},
                ]
            },
        }
    )
    custom_class_name = (
        f"projects/{PROJECT_ID}/locations/{location}/customClasses/{custom_class_id}"
    )
    # Create the phrase set resource
    phrase_set_response = adaptation_client.create_phrase_set(
        {
            "parent": parent,
            "phrase_set_id": phrase_set_id,
            "phrase_set": {
                "boost": 10,
                "phrases": [
                    {"value": f"Visit restaurants like ${{{custom_class_name}}}"}
                ],
            },
        }
    )
    phrase_set_name = phrase_set_response.name
    # The next section shows how to use the newly created custom
    # class and phrase set to send a transcription request with speech adaptation

    # Speech adaptation configuration
    speech_adaptation = speech.SpeechAdaptation(phrase_set_references=[phrase_set_name])

    # Speech configuration object
    config = speech.RecognitionConfig(
        encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
        sample_rate_hertz=24000,
        language_code="en-US",
        adaptation=speech_adaptation,
    )

    # Create the speech client
    speech_client = speech.SpeechClient()

    response = speech_client.recognize(config=config, audio=audio)

    for result in response.results:
        print(f"Transcript: {result.alternatives[0].transcript}")

    # Return the first transcript, matching the declared return type.
    return response.results[0].alternatives[0].transcript
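The function can then be called with a Cloud Storage URI and unique resource IDs of your choosing; the values below are placeholders. Because the sample creates named resources, re-running it with the same IDs is expected to fail with an already-exists error, so a cleanup sketch is included. This snippet reuses the `speech` module and `PROJECT_ID` from the sample above.

# Placeholder values; replace with your own bucket, file, and unique IDs.
transcript = transcribe_with_model_adaptation(
    "gs://[BUCKET]/[FILE]", "my-custom-class", "my-phrase-set"
)

# Optional cleanup so the sample can be re-run with the same IDs. The resource
# names are built the same way as inside the function; delete the phrase set
# first, since it references the custom class.
adaptation_client = speech.AdaptationClient()
base = f"projects/{PROJECT_ID}/locations/global"
adaptation_client.delete_phrase_set(name=f"{base}/phraseSets/my-phrase-set")
adaptation_client.delete_custom_class(name=f"{base}/customClasses/my-custom-class")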
[[["Fácil de entender","easyToUnderstand","thumb-up"],["Meu problema foi resolvido","solvedMyProblem","thumb-up"],["Outro","otherUp","thumb-up"]],[["Difícil de entender","hardToUnderstand","thumb-down"],["Informações incorretas ou exemplo de código","incorrectInformationOrSampleCode","thumb-down"],["Não contém as informações/amostras de que eu preciso","missingTheInformationSamplesINeed","thumb-down"],["Problema na tradução","translationIssue","thumb-down"],["Outro","otherDown","thumb-down"]],["Última atualização 2025-08-29 UTC."],[],[],null,["# Send a recognition request with model adaptation\n\nYou can improve the accuracy of the transcription results you\nget from Speech-to-Text by using **model adaptation**. The model\nadaptation feature lets you specify words and/or phrases that\nSpeech-to-Text must recognize more frequently in your audio data than\nother alternatives that might otherwise be suggested. Model adaptation is\nparticularly useful for improving transcription accuracy in the following use\ncases:\n\n1. Your audio contains words or phrases that are likely to occur frequently.\n2. Your audio is likely to contain words that are rare (such as proper names) or words that do not exist in general use.\n3. Your audio contains noise or is otherwise not very clear.\n\nFor more information about using this feature, see\n[Improve transcription results with model adaptation](/speech-to-text/docs/adaptation-model).\nFor information about phrase and character limits per model adaptation request,\nsee [Quotas and limits](/speech-to-text/quotas). Not all models\nsupport speech adaptation. See [Language Support](/speech-to-text/docs/speech-to-text-supported-languages)\nto see which models support adaptation.\n\nCode sample\n-----------\n\nSpeech Adaptation is an optional Speech-to-Text configuration that you\ncan use to customize your transcription results according to your needs. See the\n[`RecognitionConfig`](/speech-to-text/docs/reference/rest/v1/RecognitionConfig)\ndocumentation for more information about configuring the recognition request\nbody.\n\nThe following code sample shows how to improve transcription accuracy using a\n[SpeechAdaptation](/speech-to-text/docs/reference/rest/v1p1beta1/RecognitionConfig#speechadaptation)\nresource:\n[`PhraseSet`](/speech-to-text/docs/reference/rest/v1p1beta1/projects.locations.phraseSets),\n[`CustomClass`](/speech-to-text/docs/reference/rest/v1p1beta1/projects.locations.customClasses),\nand [model adaptation boost](/speech-to-text/docs/adaptation-model#fine-tune_transcription_results_using_boost_beta).\nTo use a `PhraseSet` or `CustomClass` in future requests, make a note of its\nresource `name`, returned in the response when you create the resource.\n\nFor a list of the pre-built classes available for your language, see\n[Supported class tokens](/speech-to-text/docs/class-tokens). 