Send a recognition request with model adaptation
You can improve the accuracy of the transcription results you get from Speech-to-Text by using model adaptation. The model adaptation feature lets you specify words and/or phrases that Speech-to-Text must recognize more frequently in your audio data than other alternatives that might otherwise be suggested. Model adaptation is particularly useful for improving transcription accuracy in the following use cases:

1. Your audio contains words or phrases that are likely to occur frequently.
2. Your audio is likely to contain words that are rare (such as proper names) or that do not exist in general use.
3. Your audio contains noise or is otherwise not very clear.

For more information about using this feature, see Improve transcription results with model adaptation. For information about phrase and character limits per model adaptation request, see Quotas and limits. Not all models support speech adaptation; see Language support to find out which models support adaptation.
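As a minimal sketch of what a model adaptation request looks like on the wire (plain JSON built with the standard library only, no API call; the project ID, phrase set ID, bucket, and file names are placeholders, and the field names follow the v1p1beta1 REST API), the adaptation portion of a recognition request body could be assembled like this:

```python
import json

# Hypothetical placeholder values; substitute your own project and phrase set ID.
project_id = "my-project"
phrase_set_name = f"projects/{project_id}/locations/global/phraseSets/my-phrase-set"

# Request body for the v1p1beta1 `speech:recognize` REST method.
# `adaptation.phraseSetReferences` points at a previously created PhraseSet.
request_body = {
    "config": {
        "encoding": "LINEAR16",
        "sampleRateHertz": 24000,
        "languageCode": "en-US",
        "adaptation": {"phraseSetReferences": [phrase_set_name]},
    },
    "audio": {"uri": "gs://my-bucket/my-audio.wav"},
}

print(json.dumps(request_body, indent=2))
```

The client-library sample later in this page builds the same structure through typed objects (`RecognitionConfig`, `SpeechAdaptation`) instead of raw JSON.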
Code sample
Speech Adaptation is an optional Speech-to-Text configuration that you can use to customize your transcription results according to your needs. See the RecognitionConfig documentation for more information about configuring the body of the recognition request.

The following code sample shows how to improve transcription accuracy using a SpeechAdaptation resource: PhraseSet, CustomClass, and model adaptation boost.

To use a PhraseSet or CustomClass in subsequent requests, make a note of its resource name, which is returned in the response when you create the resource.

For a list of the pre-built classes available for your language, see Supported class tokens.
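A brief sketch of how class references appear inside phrases (plain dicts only, no API call; the `$TIME` token and the resource names are illustrative examples, so confirm token availability for your language on the Supported class tokens page): pre-built class tokens are embedded directly in a phrase string, while a custom class is referenced by its resource name wrapped in `${...}`.

```python
# Hypothetical custom class resource name, as returned when the class is created.
custom_class_name = "projects/my-project/locations/global/customClasses/my-menu-items"

phrase_set = {
    "boost": 10,
    "phrases": [
        # Pre-built class token: matches spoken clock times.
        {"value": "The restaurant opens at $TIME"},
        # Custom class reference: matches any item defined in the class.
        {"value": f"Visit restaurants like ${{{custom_class_name}}}"},
    ],
}

for phrase in phrase_set["phrases"]:
    print(phrase["value"])
```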
```python
import os

from google.cloud import speech_v1p1beta1 as speech

PROJECT_ID = os.getenv("GOOGLE_CLOUD_PROJECT")


def transcribe_with_model_adaptation(
    audio_uri: str,
    custom_class_id: str,
    phrase_set_id: str,
) -> str:
    """Create `PhraseSet` and `CustomClasses` for custom item lists in input data.

    Args:
        audio_uri (str): The Cloud Storage URI of the input audio, e.g. gs://[BUCKET]/[FILE]
        custom_class_id (str): The unique ID of the custom class to create.
        phrase_set_id (str): The unique ID of the PhraseSet to create.
    Returns:
        The transcript of the input audio.
    """
    # Specifies the location where the Speech API will be accessed.
    location = "global"

    # Audio object
    audio = speech.RecognitionAudio(uri=audio_uri)

    # Create the adaptation client
    adaptation_client = speech.AdaptationClient()

    # The parent resource where the custom class and phrase set will be created.
    parent = f"projects/{PROJECT_ID}/locations/{location}"

    # Create the custom class resource
    adaptation_client.create_custom_class(
        {
            "parent": parent,
            "custom_class_id": custom_class_id,
            "custom_class": {
                "items": [
                    {"value": "sushido"},
                    {"value": "altura"},
                    {"value": "taneda"},
                ]
            },
        }
    )
    custom_class_name = (
        f"projects/{PROJECT_ID}/locations/{location}/customClasses/{custom_class_id}"
    )

    # Create the phrase set resource
    phrase_set_response = adaptation_client.create_phrase_set(
        {
            "parent": parent,
            "phrase_set_id": phrase_set_id,
            "phrase_set": {
                "boost": 10,
                "phrases": [
                    {"value": f"Visit restaurants like ${{{custom_class_name}}}"}
                ],
            },
        }
    )
    phrase_set_name = phrase_set_response.name

    # The next section shows how to use the newly created custom
    # class and phrase set to send a transcription request with speech adaptation

    # Speech adaptation configuration
    speech_adaptation = speech.SpeechAdaptation(phrase_set_references=[phrase_set_name])

    # Speech configuration object
    config = speech.RecognitionConfig(
        encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
        sample_rate_hertz=24000,
        language_code="en-US",
        adaptation=speech_adaptation,
    )

    # Create the speech client
    speech_client = speech.SpeechClient()

    response = speech_client.recognize(config=config, audio=audio)

    transcript = ""
    for result in response.results:
        print(f"Transcript: {result.alternatives[0].transcript}")
        transcript += result.alternatives[0].transcript

    return transcript
```
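The loop at the end of the sample prints the top-ranked alternative of each result. As a small self-contained sketch (plain dicts standing in for the API response types, with made-up transcript values), assembling the per-result top alternatives into one full transcript works like this:

```python
# Stand-in for a recognize() response: each result carries ranked alternatives,
# and alternatives[0] is the most likely transcription of that audio segment.
response = {
    "results": [
        {"alternatives": [{"transcript": "visit restaurants like sushido", "confidence": 0.92}]},
        {"alternatives": [{"transcript": "and altura downtown", "confidence": 0.88}]},
    ]
}

# Concatenate the top alternative of every result into one transcript.
transcript = " ".join(
    result["alternatives"][0]["transcript"] for result in response["results"]
)
print(transcript)  # → visit restaurants like sushido and altura downtown
```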
Python

To learn how to install and use the client library for Speech-to-Text, see Speech-to-Text client libraries. For more information, see the Speech-to-Text Python API reference documentation. To authenticate to Speech-to-Text, set up Application Default Credentials; for more information, see Set up authentication for a local development environment.

Last updated 2025-09-04 UTC.