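Preview: This feature is subject to the "Pre-GA Offerings Terms" in the General Service Terms section
of the Service Specific Terms. Pre-GA features are available "as is" and might have limited support.
You can configure Speech-to-Text to report an accuracy estimate,
or confidence level, for
individual words in a transcription.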
Word-level confidence
When Speech-to-Text transcribes an audio clip, it
also measures the degree of accuracy for the response. The response
sent from Speech-to-Text states the confidence level for the entire
transcription request as a number between 0.0 and 1.0.
The following sample shows the confidence level value
returned by Speech-to-Text.
{
  "results": [
    {
      "alternatives": [
        {
          "transcript": "how old is the Brooklyn Bridge",
          "confidence": 0.96748614
        }
      ]
    }
  ]
}
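As a rough illustration, the overall confidence can be read from a response of this shape with Python's standard `json` module. The helper name `confident_transcripts` and the 0.9 default threshold are assumptions made for this sketch; they are not part of the Speech-to-Text API.

```python
import json

# Sample response with the same shape and values as the example above.
RESPONSE_JSON = """
{
  "results": [
    {
      "alternatives": [
        {
          "transcript": "how old is the Brooklyn Bridge",
          "confidence": 0.96748614
        }
      ]
    }
  ]
}
"""


def confident_transcripts(response: dict, min_confidence: float = 0.9) -> list:
    """Return (transcript, confidence) for the first alternative of each result
    whose confidence is at least min_confidence (a hypothetical threshold)."""
    accepted = []
    for result in response.get("results", []):
        alternatives = result.get("alternatives", [])
        if not alternatives:
            continue  # a result without alternatives carries no transcript
        top = alternatives[0]
        confidence = top.get("confidence", 0.0)
        if confidence >= min_confidence:
            accepted.append((top.get("transcript", ""), confidence))
    return accepted


if __name__ == "__main__":
    parsed = json.loads(RESPONSE_JSON)
    for transcript, confidence in confident_transcripts(parsed):
        print(f"{confidence:.2f}  {transcript}")
```

A suitable threshold depends on the application, so treat 0.9 as a placeholder rather than a recommendation.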
In addition to the confidence level of the entire transcription,
Speech-to-Text can also provide the confidence level of
individual words within the transcription. In that case, the response
includes WordInfo details in the transcription,
indicating the confidence level for individual words, as shown in the
following example.
{
  "results": [
    {
      "alternatives": [
        {
          "transcript": "how old is the Brooklyn Bridge",
          "confidence": 0.98360395,
          "words": [
            {
              "startOffset": "0s",
              "endOffset": "0.300s",
              "word": "how",
              "confidence": SOME NUMBER
            },
            ...
          ]
        }
      ]
    }
  ]
}
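One possible use of per-word confidence is flagging words that may need review. The sketch below walks a parsed response of the shape shown above and yields the words whose confidence falls below a threshold. The helper name `low_confidence_words`, the 0.8 threshold, and the per-word values in `SAMPLE_RESPONSE` are all made up for illustration.

```python
from typing import Iterator, Tuple

# Illustrative response in the shape shown above.
# The confidence values are invented for this sketch.
SAMPLE_RESPONSE = {
    "results": [
        {
            "alternatives": [
                {
                    "transcript": "how old is the Brooklyn Bridge",
                    "confidence": 0.98,
                    "words": [
                        {"word": "how", "confidence": 0.99},
                        {"word": "old", "confidence": 0.62},
                        {"word": "is", "confidence": 0.97},
                        {"word": "the", "confidence": 0.95},
                        {"word": "Brooklyn", "confidence": 0.91},
                        {"word": "Bridge", "confidence": 0.93},
                    ],
                }
            ]
        }
    ]
}


def low_confidence_words(
    response: dict, threshold: float = 0.8
) -> Iterator[Tuple[str, float]]:
    """Yield (word, confidence) for each word in the first alternative of
    every result whose confidence is below threshold."""
    for result in response.get("results", []):
        alternatives = result.get("alternatives", [])
        if not alternatives:
            continue
        for info in alternatives[0].get("words", []):
            confidence = info.get("confidence")
            # Words without a confidence field are skipped rather than guessed.
            if confidence is not None and confidence < threshold:
                yield info["word"], confidence


if __name__ == "__main__":
    for word, confidence in low_confidence_words(SAMPLE_RESPONSE):
        print(f"review '{word}' (confidence {confidence:.2f})")
```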
Enable word-level confidence in a request
The following code snippets demonstrate how to enable word-level confidence
in a transcription request to Speech-to-Text using local and remote files.
Use a local file
Protocol
Refer to the speech:recognize API endpoint
for complete details.
To perform synchronous speech recognition, make a POST request and provide the
appropriate request body. The following shows an example of a POST request using
curl. The example uses the Google Cloud CLI to generate an access
token. For instructions on installing the gcloud CLI,
see the quickstart.
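The following example shows how to send a POST request using curl,
where the body of the request enables word-level confidence.
    curl -s -H "Content-Type: application/json" \
        -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
        https://speech.googleapis.com/v2/projects/{project}/locations/global/recognizers/{recognizers}:recognize \
        --data '{
        "config": {
            "features": {
                "enableWordTimeOffsets": true,
                "enableWordConfidence": true
            }
        },
        "uri": "gs://cloud-samples-tests/speech/brooklyn.flac"
    }' > word-level-confidence.txt
If the request is successful, the server returns a 200 OK HTTP status code
and the response in JSON format, which the command saves to a file
named word-level-confidence.txt.
{
  "results": [
    {
      "alternatives": [
        {
          "transcript": "how old is the Brooklyn Bridge",
          "confidence": 0.98360395,
          "words": [
            {
              "startTime": "0s",
              "endTime": "0.300s",
              "word": "how",
              "confidence": 0.98762906
            },
            {
              "startTime": "0.300s",
              "endTime": "0.600s",
              "word": "old",
              "confidence": 0.96929157
            },
            {
              "startTime": "0.600s",
              "endTime": "0.800s",
              "word": "is",
              "confidence": 0.98271006
            },
            {
              "startTime": "0.800s",
              "endTime": "0.900s",
              "word": "the",
              "confidence": 0.98271006
            },
            {
              "startTime": "0.900s",
              "endTime": "1.100s",
              "word": "Brooklyn",
              "confidence": 0.98762906
            },
            {
              "startTime": "1.100s",
              "endTime": "1.500s",
              "word": "Bridge",
              "confidence": 0.98762906
            }
          ]
        }
      ],
      "languageCode": "en-us"
    }
  ]
}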
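The request body that enables word-level confidence can also be assembled programmatically. The sketch below only builds the URL and JSON payload and does not send anything. It assumes the v2 endpoint pattern `https://speech.googleapis.com/v2/projects/{project}/locations/global/recognizers/{recognizer}:recognize` and a body with `config.features.enableWordConfidence`; the function name `build_recognize_request` and the values `my-project` and `my-recognizer` are hypothetical.

```python
import json
from typing import Tuple


def build_recognize_request(
    project: str, recognizer: str, gcs_uri: str
) -> Tuple[str, str]:
    """Return (url, json_body) for a synchronous recognize call that asks
    for word time offsets and word-level confidence."""
    # Assumed URL pattern for the v2 recognize endpoint, global location.
    url = (
        "https://speech.googleapis.com/v2/"
        f"projects/{project}/locations/global/recognizers/{recognizer}:recognize"
    )
    body = {
        "config": {
            "features": {
                "enableWordTimeOffsets": True,
                "enableWordConfidence": True,
            }
        },
        "uri": gcs_uri,
    }
    return url, json.dumps(body, indent=2)


if __name__ == "__main__":
    # "my-project" and "my-recognizer" are placeholder identifiers.
    url, payload = build_recognize_request(
        "my-project",
        "my-recognizer",
        "gs://cloud-samples-tests/speech/brooklyn.flac",
    )
    print(url)
    print(payload)
```

Keeping request construction separate from sending makes the payload easy to inspect or log before it is passed to an HTTP client together with an access token.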
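Python
To learn how to install and use the client library for Speech-to-Text, see
Speech-to-Text client libraries. For more information, see the
Speech-to-Text Python API reference documentation.
To authenticate to Speech-to-Text, set up Application Default Credentials.
For more information, see Set up authentication for a local development environment.
    from google.cloud import speech_v1p1beta1 as speech


    def transcribe_file_with_word_level_confidence(
        speech_file="resources/Google_Gnome.wav",
    ):
        """Transcribe a local audio file and print the first word's confidence."""
        client = speech.SpeechClient()

        # Read the whole audio file into memory.
        with open(speech_file, "rb") as audio_file:
            content = audio_file.read()

        audio = speech.RecognitionAudio(content=content)

        config = speech.RecognitionConfig(
            encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
            sample_rate_hertz=16000,
            language_code="en-US",
            enable_word_confidence=True,  # request per-word confidence values
        )

        response = client.recognize(config=config, audio=audio)

        for i, result in enumerate(response.results):
            # The first alternative is the most likely transcription.
            alternative = result.alternatives[0]
            print("-" * 20)
            print(f"First alternative of result {i}")
            print(f"Transcript: {alternative.transcript}")
            print(
                "First Word and Confidence: ({}, {})".format(
                    alternative.words[0].word, alternative.words[0].confidence
                )
            )

        return response.results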
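Once results come back from the client library, the per-word confidences can be aggregated. The sketch below relies only on the `alternatives`, `transcript`, `words`, `word`, and `confidence` attributes used in the sample above. The helper `word_confidence_summary` is hypothetical, and the stand-in objects are built with `SimpleNamespace` (with invented values) so the example runs without calling the API.

```python
from statistics import mean
from types import SimpleNamespace


def word_confidence_summary(results) -> list:
    """For each result, return (transcript, mean word confidence,
    least confident word) based on its first alternative."""
    summary = []
    for result in results:
        if not result.alternatives:
            continue
        alternative = result.alternatives[0]
        words = list(alternative.words)
        if not words:
            continue  # word-level confidence was not requested or is empty
        weakest = min(words, key=lambda w: w.confidence)
        average = mean(w.confidence for w in words)
        summary.append((alternative.transcript, average, weakest.word))
    return summary


# Stand-in for response.results; the values are invented for this sketch.
FAKE_RESULTS = [
    SimpleNamespace(
        alternatives=[
            SimpleNamespace(
                transcript="how old",
                words=[
                    SimpleNamespace(word="how", confidence=0.75),
                    SimpleNamespace(word="old", confidence=0.5),
                ],
            )
        ]
    )
]

if __name__ == "__main__":
    for transcript, average, weakest in word_confidence_summary(FAKE_RESULTS):
        print(f"{transcript!r}: mean={average:.3f}, weakest word={weakest!r}")
```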
Last updated 2025-08-28 UTC.