Send a recognition request with model adaptation
You can improve the accuracy of the transcription results you get from Speech-to-Text by using model adaptation. The model adaptation feature lets you specify words and/or phrases that Speech-to-Text must recognize more frequently in your audio data than other alternatives that might otherwise be suggested. Model adaptation is particularly useful for improving transcription accuracy in the following use cases:
Your audio contains words or phrases that are likely to occur frequently.
Your audio is likely to contain words that are rare (such as proper names) or words that do not exist in general use.
Your audio contains noise or is otherwise not very clear.
For more information about using this feature, see Improve transcription results with model adaptation. For information about phrase and character limits per model adaptation request, see Quotas and limits. Not all models support speech adaptation; see Language Support to find out which models support adaptation.

Speech adaptation is an optional Speech-to-Text configuration that you can use to customize your transcription results according to your needs. See the RecognitionConfig documentation for more information about configuring the recognition request body.
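As a minimal sketch of what that configuration looks like (the sample rate and phrase values here are illustrative assumptions, and the inline phrase_sets field is an alternative to referencing a stored PhraseSet resource):

from google.cloud import speech_v1p1beta1 as speech

# Minimal sketch: boost a few illustrative phrases inline in the request.
adaptation = speech.SpeechAdaptation(
    phrase_sets=[
        speech.PhraseSet(
            phrases=[{"value": "sushido"}, {"value": "altura"}],
            boost=10,
        )
    ]
)

config = speech.RecognitionConfig(
    encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
    sample_rate_hertz=16000,
    language_code="en-US",
    adaptation=adaptation,
)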
The following code sample shows how to improve transcription accuracy using a SpeechAdaptation resource: PhraseSet, CustomClass, and model adaptation boost.
To use a PhraseSet or CustomClass in future requests, make a note of its resource name, which is returned in the response when you create the resource.
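For example, a previously created PhraseSet can be referenced by its resource name in a later request; in this sketch the project ID and phrase set ID are placeholder assumptions:

from google.cloud import speech_v1p1beta1 as speech

# Placeholder resource name; use the `name` returned when you created the resource.
phrase_set_name = "projects/my-project/locations/global/phraseSets/my-phrase-set"

# Reference the stored resource in a new recognition request.
adaptation = speech.SpeechAdaptation(phrase_set_references=[phrase_set_name])

# The stored resource can also be retrieved again later.
adaptation_client = speech.AdaptationClient()
phrase_set = adaptation_client.get_phrase_set(name=phrase_set_name)
print(phrase_set.name)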
For a list of the pre-built classes available for your language, see Supported class tokens.
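As an illustrative sketch, a pre-built class token can be embedded directly in a phrase; the token $OOV_CLASS_DIGIT_SEQUENCE and the surrounding wording below are assumptions for illustration:

from google.cloud import speech_v1p1beta1 as speech

# Sketch: bias recognition toward digit sequences using a pre-built class token.
adaptation = speech.SpeechAdaptation(
    phrase_sets=[
        speech.PhraseSet(
            phrases=[{"value": "my account number is $OOV_CLASS_DIGIT_SEQUENCE"}],
            boost=10,
        )
    ]
)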
Python

To learn how to install and use the client library for Speech-to-Text, see Speech-to-Text client libraries. For more information, see the Speech-to-Text Python API reference documentation. To authenticate to Speech-to-Text, set up Application Default Credentials; for more information, see Set up authentication for a local development environment.

import os

from google.cloud import speech_v1p1beta1 as speech

PROJECT_ID = os.getenv("GOOGLE_CLOUD_PROJECT")


def transcribe_with_model_adaptation(
    audio_uri: str,
    custom_class_id: str,
    phrase_set_id: str,
) -> str:
    """Create `PhraseSet` and `CustomClasses` for custom item lists in input data.
    Args:
        audio_uri (str): The Cloud Storage URI of the input audio, e.g. gs://[BUCKET]/[FILE]
        custom_class_id (str): The unique ID of the custom class to create.
        phrase_set_id (str): The unique ID of the PhraseSet to create.
    Returns:
        The transcript of the input audio.
    """
    # Specifies the location where the Speech API will be accessed.
    location = "global"

    # Audio object
    audio = speech.RecognitionAudio(uri=audio_uri)

    # Create the adaptation client
    adaptation_client = speech.AdaptationClient()

    # The parent resource where the custom class and phrase set will be created.
    parent = f"projects/{PROJECT_ID}/locations/{location}"

    # Create the custom class resource
    adaptation_client.create_custom_class(
        {
            "parent": parent,
            "custom_class_id": custom_class_id,
            "custom_class": {
                "items": [
                    {"value": "sushido"},
                    {"value": "altura"},
                    {"value": "taneda"},
                ]
            },
        }
    )
    custom_class_name = (
        f"projects/{PROJECT_ID}/locations/{location}/customClasses/{custom_class_id}"
    )

    # Create the phrase set resource, boosting phrases that reference the custom class
    phrase_set_response = adaptation_client.create_phrase_set(
        {
            "parent": parent,
            "phrase_set_id": phrase_set_id,
            "phrase_set": {
                "boost": 10,
                "phrases": [
                    {"value": f"Visit restaurants like ${{{custom_class_name}}}"}
                ],
            },
        }
    )
    phrase_set_name = phrase_set_response.name

    # The next section shows how to use the newly created custom class
    # and phrase set to send a transcription request with speech adaptation.

    # Speech adaptation configuration
    speech_adaptation = speech.SpeechAdaptation(phrase_set_references=[phrase_set_name])

    # Speech recognition configuration object
    config = speech.RecognitionConfig(
        encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
        sample_rate_hertz=24000,
        language_code="en-US",
        adaptation=speech_adaptation,
    )

    # Create the speech client
    speech_client = speech.SpeechClient()

    response = speech_client.recognize(config=config, audio=audio)

    for result in response.results:
        print(f"Transcript: {result.alternatives[0].transcript}")

    # Return the first transcript to match the declared return type.
    return response.results[0].alternatives[0].transcript
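A hypothetical invocation of the sample might look like the following; the bucket, file name, and resource IDs are placeholders:

# Placeholder values; substitute your own bucket, audio file, and resource IDs.
transcript = transcribe_with_model_adaptation(
    audio_uri="gs://my-bucket/my-audio.wav",
    custom_class_id="my-custom-class",
    phrase_set_id="my-phrase-set",
)
print(f"Returned transcript: {transcript}")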
[[["Leicht verständlich","easyToUnderstand","thumb-up"],["Mein Problem wurde gelöst","solvedMyProblem","thumb-up"],["Sonstiges","otherUp","thumb-up"]],[["Schwer verständlich","hardToUnderstand","thumb-down"],["Informationen oder Beispielcode falsch","incorrectInformationOrSampleCode","thumb-down"],["Benötigte Informationen/Beispiele nicht gefunden","missingTheInformationSamplesINeed","thumb-down"],["Problem mit der Übersetzung","translationIssue","thumb-down"],["Sonstiges","otherDown","thumb-down"]],["Zuletzt aktualisiert: 2025-08-18 (UTC)."],[],[],null,["# Send a recognition request with model adaptation\n\nYou can improve the accuracy of the transcription results you\nget from Speech-to-Text by using **model adaptation**. The model\nadaptation feature lets you specify words and/or phrases that\nSpeech-to-Text must recognize more frequently in your audio data than\nother alternatives that might otherwise be suggested. Model adaptation is\nparticularly useful for improving transcription accuracy in the following use\ncases:\n\n1. Your audio contains words or phrases that are likely to occur frequently.\n2. Your audio is likely to contain words that are rare (such as proper names) or words that do not exist in general use.\n3. Your audio contains noise or is otherwise not very clear.\n\nFor more information about using this feature, see\n[Improve transcription results with model adaptation](/speech-to-text/docs/adaptation-model).\nFor information about phrase and character limits per model adaptation request,\nsee [Quotas and limits](/speech-to-text/quotas). Not all models\nsupport speech adaptation. See [Language Support](/speech-to-text/docs/speech-to-text-supported-languages)\nto see which models support adaptation.\n\nCode sample\n-----------\n\nSpeech Adaptation is an optional Speech-to-Text configuration that you\ncan use to customize your transcription results according to your needs. See the\n[`RecognitionConfig`](/speech-to-text/docs/reference/rest/v1/RecognitionConfig)\ndocumentation for more information about configuring the recognition request\nbody.\n\nThe following code sample shows how to improve transcription accuracy using a\n[SpeechAdaptation](/speech-to-text/docs/reference/rest/v1p1beta1/RecognitionConfig#speechadaptation)\nresource:\n[`PhraseSet`](/speech-to-text/docs/reference/rest/v1p1beta1/projects.locations.phraseSets),\n[`CustomClass`](/speech-to-text/docs/reference/rest/v1p1beta1/projects.locations.customClasses),\nand [model adaptation boost](/speech-to-text/docs/adaptation-model#fine-tune_transcription_results_using_boost_beta).\nTo use a `PhraseSet` or `CustomClass` in future requests, make a note of its\nresource `name`, returned in the response when you create the resource.\n\nFor a list of the pre-built classes available for your language, see\n[Supported class tokens](/speech-to-text/docs/class-tokens). 
\n\n### Python\n\n\nTo learn how to install and use the client library for Speech-to-Text, see\n[Speech-to-Text client libraries](/speech-to-text/docs/client-libraries).\n\n\nFor more information, see the\n[Speech-to-Text Python API\nreference documentation](/python/docs/reference/speech/latest).\n\n\nTo authenticate to Speech-to-Text, set up Application Default Credentials.\nFor more information, see\n\n[Set up authentication for a local development environment](/docs/authentication/set-up-adc-local-dev-environment).\n\n import os\n\n from google.cloud import speech_v1p1beta1 as speech\n\n PROJECT_ID = os.getenv(\"GOOGLE_CLOUD_PROJECT\")\n\n\n def transcribe_with_model_adaptation(\n audio_uri: str,\n custom_class_id: str,\n phrase_set_id: str,\n ) -\u003e str:\n \"\"\"Create `PhraseSet` and `CustomClasses` for custom item lists in input data.\n Args:\n audio_uri (str): The Cloud Storage URI of the input audio. e.g. gs://[BUCKET]/[FILE]\n custom_class_id (str): The unique ID of the custom class to create\n phrase_set_id (str): The unique ID of the PhraseSet to create.\n Returns:\n The transcript of the input audio.\n \"\"\"\n # Specifies the location where the Speech API will be accessed.\n location = \"global\"\n\n # Audio object\n audio = speech.RecognitionAudio(uri=audio_uri)\n\n # Create the adaptation client\n adaptation_client = speech.AdaptationClient()\n\n # The parent resource where the custom class and phrase set will be created.\n parent = f\"projects/{PROJECT_ID}/locations/{location}\"\n\n # Create the custom class resource\n adaptation_client.create_custom_class(\n {\n \"parent\": parent,\n \"custom_class_id\": custom_class_id,\n \"custom_class\": {\n \"items\": [\n {\"value\": \"sushido\"},\n {\"value\": \"altura\"},\n {\"value\": \"taneda\"},\n ]\n },\n }\n )\n custom_class_name = (\n f\"projects/{PROJECT_ID}/locations/{location}/customClasses/{custom_class_id}\"\n )\n # Create the phrase set resource\n phrase_set_response = adaptation_client.create_phrase_set(\n {\n \"parent\": parent,\n \"phrase_set_id\": phrase_set_id,\n \"phrase_set\": {\n \"boost\": 10,\n \"phrases\": [\n {\"value\": f\"Visit restaurants like ${{{custom_class_name}}}\"}\n ],\n },\n }\n )\n phrase_set_name = phrase_set_response.name\n # The next section shows how to use the newly created custom\n # class and phrase set to send a transcription request with speech adaptation\n\n # Speech adaptation configuration\n speech_adaptation = speech.SpeechAdaptation(phrase_set_references=[phrase_set_name])\n\n # speech configuration object\n config = speech.RecognitionConfig(\n encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,\n sample_rate_hertz=24000,\n language_code=\"en-US\",\n adaptation=speech_adaptation,\n )\n\n # Create the speech client\n speech_client = speech.SpeechClient()\n\n response = speech_client.recognize(config=config, audio=audio)\n\n for result in response.results:\n print(f\"Transcript: {result.alternatives[0].transcript}\")\n\n\u003cbr /\u003e"]]