Synthesize speech with bidirectional streaming
This document describes how to synthesize audio using bidirectional streaming.
Bidirectional streaming lets you send text input and receive audio data simultaneously. This means that you can start synthesizing speech before the complete input text is sent, which reduces latency and enables real-time interactions. Voice assistants and interactive games use bidirectional streaming to create more dynamic and responsive applications.
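The latency benefit can be illustrated without calling the API at all. The following is a minimal sketch that uses plain Python generators; `text_source` and `fake_synthesize` are hypothetical stand-ins invented for this illustration, and the "audio" is just the encoded text. It only shows that output for the first chunk can be consumed before the second chunk of text has been sent. A real bidirectional stream sends and receives concurrently rather than in strict lock-step.

```python
from typing import Iterable, Iterator, List


def text_source(chunks: Iterable[str], log: List[str]) -> Iterator[str]:
    """Hypothetical text producer: records when each chunk is sent."""
    for chunk in chunks:
        log.append(f"sent text: {chunk!r}")
        yield chunk


def fake_synthesize(text_chunks: Iterable[str], log: List[str]) -> Iterator[bytes]:
    """Hypothetical stand-in for a streaming synthesizer (not the real API).

    Emits one placeholder byte string per text chunk as soon as it arrives.
    """
    for chunk in text_chunks:
        log.append(f"received text: {chunk!r}")
        yield chunk.encode("utf-8")  # placeholder bytes, not real audio


events: List[str] = []
chunks = ["Hello there. ", "How are you?"]
for audio in fake_synthesize(text_source(chunks, events), events):
    events.append(f"got audio: {len(audio)} bytes")

print("\n".join(events))
```

In the printed event log, the first "got audio" entry appears before the second "sent text" entry, which is the property that makes streaming synthesis feel responsive.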
To learn more about the fundamental concepts in Text-to-Speech, read Text-to-Speech Basics.
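Before you begin
Before you can send a request to the Text-to-Speech API, you must have completed the following actions. See the Before you begin page for details.
- Enable Text-to-Speech on a Google Cloud project, and make sure billing is enabled for Text-to-Speech.
- Install the Google Cloud CLI, sign in to the gcloud CLI with your federated identity, and then initialize the Google Cloud CLI by running `gcloud init`.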
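Synthesize speech with bidirectional streaming
Install the client library
Before installing the library, make sure you've prepared your environment for Python development. Then install the Python client library by running `pip install --upgrade google-cloud-texttospeech`.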
Send a stream of text and receive a stream of audio
The API accepts a stream of requests of type `StreamingSynthesizeRequest`, each of which contains either a `StreamingSynthesisInput` or a `StreamingSynthesizeConfig`.
Before you send any `StreamingSynthesizeRequest` that carries a `StreamingSynthesisInput` with text input, send exactly one `StreamingSynthesizeRequest` with a `StreamingSynthesizeConfig`.
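The ordering rule above can be sketched offline as follows. This is a minimal illustration under stated assumptions, not the client library: `Config`, `Request`, `build_requests`, and `check_order` are hypothetical names that only model the "one config first, then text" shape, and the check runs locally rather than reflecting how the service itself validates a stream.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Optional


# Hypothetical stand-ins for the real message types. They model only the
# "config or input" shape described above and are not the client library.
@dataclass
class Config:
    voice_name: str


@dataclass
class Request:
    streaming_config: Optional[Config] = None
    text: Optional[str] = None


def build_requests(config: Config, chunks: Iterable[str]) -> Iterator[Request]:
    """Yield exactly one config request, then one request per text chunk."""
    yield Request(streaming_config=config)
    for chunk in chunks:
        yield Request(text=chunk)


def check_order(requests: Iterable[Request]) -> int:
    """Return the number of text requests; raise if the ordering rule is broken."""
    text_count = 0
    for index, request in enumerate(requests):
        has_config = request.streaming_config is not None
        has_text = request.text is not None
        if has_config == has_text:
            raise ValueError(f"request {index} must carry exactly one of config or text")
        if has_config != (index == 0):
            raise ValueError(f"request {index}: config must come first and only once")
        if has_text:
            text_count += 1
    return text_count


stream = build_requests(Config("en-US-Chirp3-HD-Charon"), ["Hello there. ", "How are you today?"])
print(check_order(stream))
```

Because `build_requests` is a generator, the text chunks can come from any lazy source, so the config request is emitted before the first chunk of text even exists.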
Streaming Text-to-Speech is only compatible with Chirp 3: HD voices.
Before running the example, make sure you've prepared your environment for Python development.

```python
#!/usr/bin/env python
# Copyright 2024 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

"""Google Cloud Text-To-Speech API streaming sample application.

Example usage:
    python streaming_tts_quickstart.py
"""


def run_streaming_tts_quickstart():
    """Synthesizes speech from a stream of input text."""
    from google.cloud import texttospeech

    client = texttospeech.TextToSpeechClient()

    # See https://cloud.google.com/text-to-speech/docs/voices for all voices.
    streaming_config = texttospeech.StreamingSynthesizeConfig(
        voice=texttospeech.VoiceSelectionParams(
            name="en-US-Chirp3-HD-Charon",
            language_code="en-US",
        )
    )

    # Set the config for your stream. The first request must contain your config,
    # and then each subsequent request must contain text.
    config_request = texttospeech.StreamingSynthesizeRequest(
        streaming_config=streaming_config
    )

    text_iterator = [
        "Hello there. ",
        "How are you ",
        "today? It's ",
        "such nice weather outside.",
    ]

    # Request generator. Consider using Gemini or another LLM with output
    # streaming as a generator.
    def request_generator():
        yield config_request
        for text in text_iterator:
            yield texttospeech.StreamingSynthesizeRequest(
                input=texttospeech.StreamingSynthesisInput(text=text)
            )

    streaming_responses = client.streaming_synthesize(request_generator())

    for response in streaming_responses:
        print(f"Audio content size in bytes is: {len(response.audio_content)}")


if __name__ == "__main__":
    run_streaming_tts_quickstart()
```

Clean up
To avoid unnecessary Google Cloud Platform charges, use the Google Cloud console (https://console.cloud.google.com/) to delete your project if you do not need it.
What's next
- Learn more about Cloud Text-to-Speech by reading the basics.
- Review the list of available voices you can use for synthetic speech.
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated: 2025-09-04 (UTC).