Recognize speech by using medical models
Speech-to-Text offers two medical models in addition to the other standard and enhanced speech recognition models.
The medical models are specifically tailored for recognition of words that are common in medical settings, such as diagnoses, medications, symptoms, treatments, and conditions. If you want to recognize this type of audio data, you can improve your transcription results by using these models.
There are two medical models, each tailored to specific use cases:
medical_conversation: for conversations between a medical provider (for example, a doctor or nurse) and a patient. Use this model when both a provider and a patient are speaking. Words uttered by each speaker are automatically detected and labeled in the returned transcript.
medical_dictation: for dictated notes spoken by a single medical provider, for example, a doctor dictating notes about a patient's blood test results.
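This page does not show how the speaker labels appear in a medical_conversation response. As a minimal sketch, assuming each word-level entry carries a `word` and a `speakerTag` field (check the API reference for the exact shape), consecutive words can be grouped into speaker turns like this. The function name and the sample data are hypothetical.

```python
from itertools import groupby


def group_by_speaker(words):
    """Merge consecutive words that share a speakerTag into one turn."""
    turns = []
    for tag, run in groupby(words, key=lambda item: item.get("speakerTag")):
        turns.append((tag, " ".join(item["word"] for item in run)))
    return turns


# Hypothetical word-level entries, not taken from a real response.
sample_words = [
    {"word": "Good", "speakerTag": 1},
    {"word": "morning", "speakerTag": 1},
    {"word": "Good", "speakerTag": 2},
    {"word": "morning", "speakerTag": 2},
]

for tag, text in group_by_speaker(sample_words):
    print(f"Speaker {tag}: {text}")
```

Grouping only consecutive words keeps the turn order of the conversation intact, which a plain dictionary keyed by speaker would lose.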
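Use medical models only with the following Speech-to-Text features. Features omitted from this list can't be used with either medical model.
The medical conversation model supports the following features:
Speaker diarization
Alternate transcriptions
Word timestamps
and requires that the following features be enabled:
Automatic punctuation
The medical dictation model supports the following features:
Alternate transcriptions
Word timestamps
Formatting commands
Spoken headings
and requires that the following features be enabled:
Automatic punctuation
Spoken punctuation
Send a transcription request
The following sample uses the medical_conversation model to transcribe an audio file in a public Cloud Storage bucket. Before using any of the request data, make the following replacements:
LANGUAGE_CODE: the BCP-47 code of the language spoken in your audio clip. Medical models are only available for en-US.
ENCODING: the encoding of the audio you want to transcribe. If you are using the public audio sample, the encoding is LINEAR16.
PROJECT_ID: the alphanumeric ID of your Google Cloud project.
HTTP method and URL:
POST https://speech.googleapis.com/v1/speech:recognize
Request JSON body:
{
  "config": {
    "languageCode": "LANGUAGE_CODE",
    "encoding": "ENCODING",
    "model": "medical_conversation"
  },
  "audio": {
    "uri": "gs://cloud-samples-data/speech/medical_conversation_2.wav"
  }
}
Save the request body in a file named request.json and send it with curl. The command assumes that you are logged in to the gcloud CLI with your user account:
curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "x-goog-user-project: PROJECT_ID" \
  -H "Content-Type: application/json; charset=utf-8" \
  -d @request.json \
  "https://speech.googleapis.com/v1/speech:recognize"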
You should receive a JSON response similar to the following:
{
  "results": [
    {
      "alternatives": [
        {
          "transcript": "Um-hum . Yeah. Hello , good morning . Good morning . So , tell me what's going on . Uh , sure , so , um , I woke up probably three or four days ago , which , uh , wheezing and short of breath . Okay , any cough or chest pain ? I cough infrequently , but no , uh , chest pain . Have you been exposed to anyone with covid ? Uh , no , and I also took a test , which was negative . Uh , is it getting worse , or better ? Uh , it has been getting a lot worse"
        }
      ]
    },
    {
      "alternatives": [
        {
          "transcript": "Okay . Was there something that triggered this exposure to cold , for example ? Um , I had a gone hiking , and I got caught in the rain the day before this all started ."
        }
      ]
    }
  ]
}
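A response of this shape can also be requested and read from a script. The following is a rough sketch using only the Python standard library. It assumes the v1 speech:recognize REST endpoint and the request fields named in this page's sample, and it assumes the gcloud CLI is installed and logged in so that it can supply an access token. The helper names are hypothetical.

```python
import json
import subprocess
import urllib.request

ENDPOINT = "https://speech.googleapis.com/v1/speech:recognize"


def build_request(language_code, encoding, model, audio_uri):
    """Return the JSON body for a speech:recognize call."""
    return {
        "config": {
            "languageCode": language_code,
            "encoding": encoding,
            "model": model,
        },
        "audio": {"uri": audio_uri},
    }


def extract_transcripts(response):
    """Return the first alternative's transcript from each result."""
    return [
        result["alternatives"][0]["transcript"]
        for result in response.get("results", [])
        if result.get("alternatives")
    ]


def recognize(body, project_id):
    """POST the body to the endpoint; needs network access and gcloud."""
    token = subprocess.run(
        ["gcloud", "auth", "print-access-token"],
        check=True, capture_output=True, text=True,
    ).stdout.strip()
    request = urllib.request.Request(
        ENDPOINT,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "x-goog-user-project": project_id,
            "Content-Type": "application/json; charset=utf-8",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as reply:
        return json.load(reply)
```

For the public sample, the body would be built with `build_request("en-US", "LINEAR16", "medical_conversation", "gs://cloud-samples-data/speech/medical_conversation_2.wav")`, passed to `recognize` together with your project ID, and the returned dictionary handed to `extract_transcripts`.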
Spoken punctuation
The medical dictation model supports spoken punctuation for medical notes. This feature is always enabled. Spoken punctuation is delineated by brackets in the speech transcription. For example, your returned transcription might look similar to the following:
Patient could be showing signs of trauma [question mark] They said they were [quote] having elevated heart rate [unquote].
Speech-to-Text supports the following spoken punctuation:
period
comma
colon
caps
slash
dash
hyphen
question mark
semicolon
quote
unquote
end quote
open parenthesis
close parenthesis
end parenthesis
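Because the punctuation arrives as bracketed tokens, downstream code may want to turn it into actual characters. The sketch below is one possible post-processing step, not part of Speech-to-Text. The token-to-character mapping and the spacing rules are assumptions, and `caps` is left untouched because it has no character equivalent.

```python
import re

# Assumed rendering of each token, including the spacing around it.
PUNCTUATION = {
    "period": ". ",
    "comma": ", ",
    "colon": ": ",
    "semicolon": "; ",
    "question mark": "? ",
    "slash": "/",
    "hyphen": "-",
    "dash": " - ",
    "quote": ' "',
    "unquote": '" ',
    "end quote": '" ',
    "open parenthesis": " (",
    "close parenthesis": ") ",
    "end parenthesis": ") ",
}
TOKEN = re.compile(r"\s*\[([a-z ]+)\]\s*")


def render_punctuation(transcript):
    """Replace known bracketed tokens; unknown tokens are kept as-is."""
    text = TOKEN.sub(
        lambda match: PUNCTUATION.get(match.group(1), match.group(0)),
        transcript,
    )
    # Drop whitespace left in front of closing punctuation.
    text = re.sub(r"\s+([.,;:?)])", r"\1", text)
    # Collapse runs of spaces created by adjacent replacements.
    return re.sub(r" {2,}", " ", text).strip()


example = (
    "Patient could be showing signs of trauma [question mark] They said "
    "they were [quote] having elevated heart rate [unquote]."
)
print(render_punctuation(example))
```

Tokens outside the mapping, such as formatting commands, pass through unchanged, so this step can run before or after other post-processing.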
Formatting commands
The medical dictation model supports spoken commands for formatting notes. This feature is always enabled. The spoken commands are delineated by brackets in the speech transcription. For example, your returned transcription might look similar to the following:
[next line] Patient says they are experiencing fever [next point].
Speech-to-Text supports the following spoken commands:
next point
next number
next paragraph
caps
capitalization
new line
next item
next problem
next problem number
next row
next section
number next
scratch
scratch that
end dictation
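The transcription only marks where a command was spoken; applying it is left to the consumer. As an illustrative sketch, the following turns a few line-oriented commands into line breaks and leaves every other command in place. How each command should be rendered is an assumption here, and `next line` is included only because it appears in the example above.

```python
import re

# Assumed rendering; the page does not define how commands are applied.
LINE_BREAKS = {
    "new line": "\n",
    "next line": "\n",
    "next paragraph": "\n\n",
}
COMMAND = re.compile(r"\s*\[([a-z ]+)\]\s*")


def apply_line_breaks(transcript):
    """Replace line-break commands and keep all other tokens unchanged."""
    return COMMAND.sub(
        lambda match: LINE_BREAKS.get(match.group(1), match.group(0)),
        transcript,
    )


note = "[next line] Patient says they are experiencing fever [next point]."
print(repr(apply_line_breaks(note)))
```

Commands such as `scratch that` change earlier text rather than insert characters, so they would need separate handling that this sketch does not attempt.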
Spoken headings
The medical dictation model supports spoken headings for dictated notes. This feature is enabled by default and cannot be disabled. The headings are delineated by brackets in the transcription and are capitalized. For example, your returned transcription might look similar to the following:
[CURRENT MEDICATIONS] Patient is currently taking no medications.
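Since headings are uppercase and bracketed, a dictated note can be split into sections with a small amount of parsing. This is a minimal sketch under that assumption. The function name is hypothetical, and lowercase tokens such as formatting commands are deliberately not treated as headings.

```python
import re

# Matches bracketed text made of uppercase letters and spaces only.
HEADING = re.compile(r"\[([A-Z][A-Z ]*)\]")


def split_sections(transcript):
    """Return (heading, body) pairs; text before any heading gets None."""
    sections = []
    heading = None
    position = 0
    for match in HEADING.finditer(transcript):
        body = transcript[position:match.start()].strip()
        if body or heading is not None:
            sections.append((heading, body))
        heading = match.group(1)
        position = match.end()
    tail = transcript[position:].strip()
    if tail or heading is not None:
        sections.append((heading, tail))
    return sections


note = "[CURRENT MEDICATIONS] Patient is currently taking no medications."
print(split_sections(note))
```

Returning a list of pairs rather than a dictionary preserves the dictated order and tolerates a heading that is spoken more than once.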
Speech-to-Text supports the following spoken headings:
CHIEF COMPLAINT
CURRENT MEDICATIONS
DISCHARGE MEDICATIONS
DISCHARGE PLAN
FAMILY HISTORY
FINDINGS
REVIEW OF SYSTEMS
HISTORY OF PRESENT ILLNESS
INDICATIONS
LABS
PAST SURGICAL HISTORY
PHYSICAL EXAM
RADIOLOGY
Last updated 2025-09-04 UTC.