Evaluate models

Note: This feature is in Preview and is subject to the "Pre-GA Offerings Terms" in the General Service Terms section of the Service Specific Terms (/terms/service-terms#1). Pre-GA features are available "as is" and might have limited support. For more information, see the launch stage descriptions (/products#product-launch-stages).
Use the benchmarking functionality of the Cloud Speech-to-Text Console to measure the accuracy of any of the transcription models used in the Speech-to-Text V2 API.
The Cloud Speech-to-Text Console provides visual benchmarking for pre-trained and Custom Speech-to-Text models. You can inspect recognition quality by comparing Word Error Rate (WER) evaluation metrics across multiple transcription models, which helps you decide which model best fits your application.
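To illustrate what the WER metric measures, here is a minimal word-level edit-distance sketch. This is not the console's implementation, only a conventional definition of WER: the number of word substitutions, insertions, and deletions needed to turn the hypothesis into the reference, divided by the reference length.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Conventional WER: word-level edit distance / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    # dp[i][j] = minimum edits to turn the first i reference words
    # into the first j hypothesis words
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(
                dp[i - 1][j] + 1,        # deletion
                dp[i][j - 1] + 1,        # insertion
                dp[i - 1][j - 1] + cost, # substitution (or match)
            )
    return dp[len(ref)][len(hyp)] / max(len(ref), 1)
```

A lower WER indicates a closer match between the model's transcript and the ground truth; the console reports this metric per model so you can compare them side by side.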
Before you begin
Ensure you have signed up for a Google Cloud account, created a project, trained a custom speech model, and deployed it using an endpoint.
Create a ground-truth dataset
To create a custom benchmarking dataset, gather audio samples that accurately reflect the type of traffic the transcription model will encounter in a production environment. The aggregate duration of these audio files should ideally span a minimum of 30 minutes and not exceed 10 hours. To assemble the dataset, you need to:
1. Create a directory in a Cloud Storage bucket of your choice to store the audio and text files for the dataset.
2. Create reasonably accurate transcriptions for every audio file in the dataset. For each audio file (such as example_audio_1.wav), a corresponding ground-truth text file (example_audio_1.txt) must be created. The service uses these audio-text pairs in the Cloud Storage bucket to assemble the dataset.
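Before uploading the dataset, you might want to verify locally that every audio file has a matching transcript, since an unpaired file cannot be used by the service. A small sketch (the `.wav` extension and flat directory layout are assumptions; adjust for your file formats):

```python
from pathlib import Path

def find_unpaired(dataset_dir: str) -> list[str]:
    """Return names of .wav files that lack a matching .txt transcript.

    Each audio file example_audio_1.wav is expected to have a
    ground-truth text file example_audio_1.txt beside it.
    """
    d = Path(dataset_dir)
    return sorted(
        wav.name
        for wav in d.glob("*.wav")
        if not (d / (wav.stem + ".txt")).exists()
    )
```

Running this check on your local copy of the dataset before copying it into the Cloud Storage bucket catches missing transcripts early; an empty result means every audio file is paired.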
Benchmark the model

To assess the accuracy of your Custom Speech-to-Text model against your benchmarking dataset, follow the Measure and improve accuracy guide (/speech-to-text/docs/measure-accuracy).

Last updated 2025-08-18 UTC.