You can specify that Speech-to-Text report an accuracy value, or confidence level, for each word in a transcription.
Word-level confidence
When Speech-to-Text transcribes an audio clip, it also measures the degree of accuracy of the response. The response sent from Speech-to-Text states the confidence level for the entire transcription request as a number between 0.0 and 1.0.
The following code sample shows an example of the confidence level value returned by Speech-to-Text.
{
  "results": [
    {
      "alternatives": [
        {
          "transcript": "how old is the Brooklyn Bridge",
          "confidence": 0.96748614
        }
      ]
    }
  ]
}
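As a minimal sketch of how a client might read this value, the following Python snippet parses the sample response above with the standard `json` module and pulls out the transcript and confidence of the first alternative in each result. The helper name `top_alternatives` is hypothetical, and treating a missing `confidence` field as 0.0 is an assumption of this sketch rather than documented behavior.

```python
import json

# Sample response copied from the example above.
RESPONSE_JSON = """
{
  "results": [
    {
      "alternatives": [
        {
          "transcript": "how old is the Brooklyn Bridge",
          "confidence": 0.96748614
        }
      ]
    }
  ]
}
"""


def top_alternatives(response: dict) -> list[tuple[str, float]]:
    """Return (transcript, confidence) for the first alternative of each result."""
    pairs = []
    for result in response.get("results", []):
        alternatives = result.get("alternatives", [])
        if not alternatives:
            continue
        first = alternatives[0]
        # Assumption of this sketch: a missing confidence is treated as 0.0.
        confidence = float(first.get("confidence", 0.0))
        pairs.append((first.get("transcript", ""), confidence))
    return pairs


pairs = top_alternatives(json.loads(RESPONSE_JSON))
for transcript, confidence in pairs:
    print(f"{confidence:.2f}  {transcript}")
```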
In addition to the confidence level of the entire transcription, Speech-to-Text can also provide the confidence level of individual words within the transcription. The response then
includes WordInfo details in the transcription, indicating the confidence level for individual words as shown in the
following example.
{
  "results": [
    {
      "alternatives": [
        {
          "transcript": "how old is the Brooklyn Bridge",
          "confidence": 0.98360395,
          "words": [
            {
              "startOffset": "0s",
              "endOffset": "0.300s",
              "word": "how",
              "confidence": SOME NUMBER
            },
            ...
          ]
        }
      ]
    }
  ]
}
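One common use of per-word confidence is to flag words that may need review. The sketch below assumes a response in the shape shown above (`words` entries with `startOffset`, `endOffset`, `word`, and `confidence`). The confidence values in the sample data are invented for illustration, and the helper names `parse_offset` and `low_confidence_words` and the 0.8 threshold are hypothetical choices, not part of the API.

```python
import json

# Response in the shape shown above. The per-word confidence values are
# made up for illustration; they are not output from a real request.
RESPONSE_JSON = """
{
  "results": [
    {
      "alternatives": [
        {
          "transcript": "how old",
          "confidence": 0.9,
          "words": [
            {"startOffset": "0s", "endOffset": "0.300s",
             "word": "how", "confidence": 0.99},
            {"startOffset": "0.300s", "endOffset": "0.600s",
             "word": "old", "confidence": 0.62}
          ]
        }
      ]
    }
  ]
}
"""


def parse_offset(value: str) -> float:
    """Convert an offset string such as "0.300s" to seconds."""
    return float(value.rstrip("s"))


def low_confidence_words(response: dict, threshold: float) -> list[dict]:
    """Collect every word whose confidence is below the threshold."""
    flagged = []
    for result in response.get("results", []):
        for alternative in result.get("alternatives", []):
            for info in alternative.get("words", []):
                # Assumption of this sketch: missing confidence counts as 0.0.
                confidence = float(info.get("confidence", 0.0))
                if confidence < threshold:
                    flagged.append({
                        "word": info["word"],
                        "confidence": confidence,
                        "start": parse_offset(info["startOffset"]),
                        "end": parse_offset(info["endOffset"]),
                    })
    return flagged


flagged = low_confidence_words(json.loads(RESPONSE_JSON), threshold=0.8)
for item in flagged:
    print(f'{item["word"]}: {item["confidence"]:.2f} '
          f'({item["start"]:.1f}s to {item["end"]:.1f}s)')
```

The right threshold depends on the audio and the application, so it is worth tuning against transcripts you have already checked by hand.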
Enable word-level confidence in a request
The following code snippets demonstrate how to enable word-level confidence
in a transcription request to Speech-to-Text using local and remote files.
Use a local file
Protocol
See the speech:recognize API endpoint
for complete details.
To perform synchronous speech recognition, make a POST request and provide
the appropriate request body. The following shows an example of a POST request using
curl. The example uses the Google Cloud CLI to generate an access token. For instructions on installing the gcloud CLI,
see the quickstart.
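The following example shows how to send a POST request using curl, where the body of the request enables word-level confidence.
curl -s -H "Content-Type: application/json" \
    -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
    https://speech.googleapis.com/v2/projects/{project}/locations/global/recognizers/{recognizers}:recognize \
    --data '{
    "config": {
        "features": {
            "enableWordTimeOffsets": true,
            "enableWordConfidence": true
        }
    },
    "uri": "gs://cloud-samples-tests/speech/brooklyn.flac"
}' > word-level-confidence.txt
If the request is successful, the server returns a 200 OK HTTP status code and the response in JSON format, saved to a file named word-level-confidence.txt.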
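When the request is assembled in a program rather than typed by hand, building the body as a dictionary and serializing it avoids quoting mistakes. The following is a minimal sketch. The field names (`config.features.enableWordConfidence`, `enableWordTimeOffsets`, `uri`) and the sample `gs://` URI are taken from this page's curl request, while the helper `build_recognize_body` is a hypothetical name introduced only for illustration.

```python
import json


def build_recognize_body(uri: str,
                         word_confidence: bool = True,
                         word_time_offsets: bool = True) -> str:
    """Serialize a recognize request body that enables word-level features."""
    body = {
        "config": {
            "features": {
                "enableWordTimeOffsets": word_time_offsets,
                "enableWordConfidence": word_confidence,
            }
        },
        "uri": uri,
    }
    return json.dumps(body, indent=2)


body = build_recognize_body("gs://cloud-samples-tests/speech/brooklyn.flac")
print(body)
```

The resulting string could be passed to curl with `--data`, or sent with any HTTP client, along with the same `Content-Type` and `Authorization` headers.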
Python
from google.cloud import speech_v1p1beta1 as speech


def transcribe_with_word_confidence(speech_file="resources/Google_Gnome.wav"):
    """Transcribe a local audio file and print the first word's confidence."""
    client = speech.SpeechClient()

    # Read the whole audio file into memory.
    with open(speech_file, "rb") as audio_file:
        content = audio_file.read()

    audio = speech.RecognitionAudio(content=content)

    # enable_word_confidence asks the service for per-word confidence values.
    config = speech.RecognitionConfig(
        encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
        sample_rate_hertz=16000,
        language_code="en-US",
        enable_word_confidence=True,
    )

    response = client.recognize(config=config, audio=audio)

    for i, result in enumerate(response.results):
        alternative = result.alternatives[0]
        print("-" * 20)
        print(f"First alternative of result {i}")
        print(f"Transcript: {alternative.transcript}")
        print(
            "First Word and Confidence: ({}, {})".format(
                alternative.words[0].word, alternative.words[0].confidence
            )
        )

    return response.results