Generate dialogue with multiple speakers

Note: This feature is only available to allowlisted projects. Contact us if you want to use this feature.
This page describes how to create a dialogue with multiple speakers generated by Text-to-Speech.
You can generate audio with multiple speakers to create a dialogue. This can be useful for interviews, interactive storytelling, video games, e-learning platforms, and accessibility solutions.
The following voice is supported for multi-speaker audio:
- en-US-Studio-Multispeaker
  - speaker: R
  - speaker: S
  - speaker: T
  - speaker: U
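To make the mapping concrete, the following sketch builds a MultiSpeakerMarkup in which each turn references one of the speaker codes above. The build_markup helper and the sample lines are hypothetical; only the MultiSpeakerMarkup and Turn types come from the texttospeech_v1beta1 client library used in the full example later on this page.

from google.cloud import texttospeech_v1beta1 as texttospeech

# Hypothetical helper: build a MultiSpeakerMarkup from (speaker, text) pairs.
# Speakers are referenced by their single-letter codes (R, S, T, U).
def build_markup(lines):
    return texttospeech.MultiSpeakerMarkup(
        turns=[
            texttospeech.MultiSpeakerMarkup.Turn(speaker=speaker, text=text)
            for speaker, text in lines
        ]
    )

# Hypothetical two-speaker dialogue using the codes R and S.
markup = build_markup(
    [
        ("R", "Welcome to the show."),
        ("S", "Thanks, happy to be here."),
    ]
)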
Example: This sample is audio that was generated using multiple speakers.

Note: This feature supports a maximum of two speakers. It is an experimental offering and is not included in the list of voices (/text-to-speech/docs/list-voices).
Example of how to use multi-speaker markup
This example demonstrates how to use multi-speaker markup.

Python

To learn how to install and use the client library for Text-to-Speech, see Text-to-Speech client libraries (/text-to-speech/docs/libraries). For more information, see the Text-to-Speech Python API reference documentation (/python/docs/reference/texttospeech/latest).

To authenticate to Text-to-Speech, set up Application Default Credentials. For more information, see Set up authentication for a local development environment (/docs/authentication/set-up-adc-local-dev-environment).
"""Synthesizes speech for multiple speakers.Make sure to be working in a virtual environment."""fromgoogle.cloudimporttexttospeech_v1beta1astexttospeech# Instantiates a clientclient=texttospeech.TextToSpeechClient()multi_speaker_markup=texttospeech.MultiSpeakerMarkup(turns=[texttospeech.MultiSpeakerMarkup.Turn(text="I've heard that the Google Cloud multi-speaker audio generation sounds amazing!",speaker="R",),texttospeech.MultiSpeakerMarkup.Turn(text="Oh? What's so good about it?",speaker="S"),texttospeech.MultiSpeakerMarkup.Turn(text="Well..",speaker="R"),texttospeech.MultiSpeakerMarkup.Turn(text="Well what?",speaker="S"),texttospeech.MultiSpeakerMarkup.Turn(text="Well, you should find it out by yourself!",speaker="R"),texttospeech.MultiSpeakerMarkup.Turn(text="Alright alright, let's try it out!",speaker="S"),])# Set the text input to be synthesizedsynthesis_input=texttospeech.SynthesisInput(multi_speaker_markup=multi_speaker_markup)# Build the voice request, select the language code ('en-US') and the voicevoice=texttospeech.VoiceSelectionParams(language_code="en-US",name="en-US-Studio-MultiSpeaker")# Select the type of audio file you want returnedaudio_config=texttospeech.AudioConfig(audio_encoding=texttospeech.AudioEncoding.MP3)# Perform the text-to-speech request on the text input with the selected# voice parameters and audio file typeresponse=client.synthesize_speech(input=synthesis_input,voice=voice,audio_config=audio_config)# The response's audio_content is binary.withopen("output.mp3","wb")asout:# Write the response to the output file.out.write(response.audio_content)print('Audio content written to file "output.mp3"')
[[["Leicht verständlich","easyToUnderstand","thumb-up"],["Mein Problem wurde gelöst","solvedMyProblem","thumb-up"],["Sonstiges","otherUp","thumb-up"]],[["Schwer verständlich","hardToUnderstand","thumb-down"],["Informationen oder Beispielcode falsch","incorrectInformationOrSampleCode","thumb-down"],["Benötigte Informationen/Beispiele nicht gefunden","missingTheInformationSamplesINeed","thumb-down"],["Problem mit der Übersetzung","translationIssue","thumb-down"],["Sonstiges","otherDown","thumb-down"]],["Zuletzt aktualisiert: 2025-09-04 (UTC)."],[],[],null,["# Generate dialogue with multiple speakers\n\n| **Note:** This feature is only available to projects in allowlist. Please contact us if you want to use this feature.\n\nThis page describes how to create a dialogue with multiple speakers\ncreated by Text-to-Speech.\n\nYou can generate audio with multiple speakers to create a dialogue. This can be\nuseful for interviews, interactive storytelling, video games,\ne-learning platforms, and accessibility solutions.\n\nThe following voice is supported for audio with multiple speakers:\n\n- `en-US-Studio-Multispeaker`\n - speaker: `R`\n - speaker: `S`\n - speaker: `T`\n - speaker: `U`\n\nYour browser does not support the audio element. \n\n*Example. This sample is audio that was generated using multiple speakers.* \n| **Note:** This feature supports a maximum of two speakers. It's an experimental offering and not in the [list of voices](/text-to-speech/docs/list-voices).\n\nExample of how to use multi-speaker markup\n------------------------------------------\n\nThis is an example that demonstrates how to use multi-speaker markup. \n\n### Python\n\n\nTo learn how to install and use the client library for Text-to-Speech, see\n[Text-to-Speech client libraries](/text-to-speech/docs/libraries).\n\n\nFor more information, see the\n[Text-to-Speech Python API\nreference documentation](/python/docs/reference/texttospeech/latest).\n\n\nTo authenticate to Text-to-Speech, set up Application Default Credentials.\nFor more information, see\n\n[Set up authentication for a local development environment](/docs/authentication/set-up-adc-local-dev-environment).\n\n \"\"\"Synthesizes speech for multiple speakers.\n Make sure to be working in a virtual environment.\n \"\"\"\n from google.cloud import texttospeech_v1beta1 as texttospeech\n\n # Instantiates a client\n client = texttospeech.TextToSpeechClient()\n\n multi_speaker_markup = texttospeech.MultiSpeakerMarkup(\n turns=[\n texttospeech.MultiSpeakerMarkup.Turn(\n text=\"I've heard that the Google Cloud multi-speaker audio generation sounds amazing!\",\n speaker=\"R\",\n ),\n texttospeech.MultiSpeakerMarkup.Turn(\n text=\"Oh? 
What's so good about it?\", speaker=\"S\"\n ),\n texttospeech.MultiSpeakerMarkup.Turn(text=\"Well..\", speaker=\"R\"),\n texttospeech.MultiSpeakerMarkup.Turn(text=\"Well what?\", speaker=\"S\"),\n texttospeech.MultiSpeakerMarkup.Turn(\n text=\"Well, you should find it out by yourself!\", speaker=\"R\"\n ),\n texttospeech.MultiSpeakerMarkup.Turn(\n text=\"Alright alright, let's try it out!\", speaker=\"S\"\n ),\n ]\n )\n\n # Set the text input to be synthesized\n synthesis_input = texttospeech.SynthesisInput(\n multi_speaker_markup=multi_speaker_markup\n )\n\n # Build the voice request, select the language code ('en-US') and the voice\n voice = texttospeech.VoiceSelectionParams(\n language_code=\"en-US\", name=\"en-US-Studio-MultiSpeaker\"\n )\n\n # Select the type of audio file you want returned\n audio_config = texttospeech.AudioConfig(\n audio_encoding=texttospeech.AudioEncoding.MP3\n )\n\n # Perform the text-to-speech request on the text input with the selected\n # voice parameters and audio file type\n response = client.synthesize_speech(\n input=synthesis_input, voice=voice, audio_config=audio_config\n )\n\n # The response's audio_content is binary.\n with open(\"output.mp3\", \"wb\") as out:\n # Write the response to the output file.\n out.write(response.audio_content)\n print('Audio content written to file \"output.mp3\"')\n\n\u003cbr /\u003e"]]