Text-to-Speech クライアントライブラリ

このページでは、Text-to-Speech API の Cloud クライアントライブラリの使用を開始する方法を説明します。クライアントライブラリを使用すると、サポートされている言語からGoogle Cloud API に簡単にアクセスできます。サーバーにリクエストを送信してGoogle Cloud API を直接利用することもできますが、クライアントライブラリを使用すると、記述するコードの量を大幅に削減できます。

Cloud クライアントライブラリと以前の Google API クライアントライブラリの詳細については、クライアントライブラリの説明をご覧ください。

クライアントライブラリをインストールする

C++

このクライアントライブラリの要件とインストールの依存関係の詳細については、C++ 開発環境の設定をご覧ください。

C#

Visual Studio 2017 以降を使用している場合は、Nuget パッケージマネージャーのウィンドウを開き、次のように入力します。

Install-Package Google.Apis

.NET Core コマンドラインインターフェースを使用して依存関係をインストールしている場合は、次のコマンドを実行します。

dotnet add package Google.Apis

詳細については、C# 開発環境の設定をご覧ください。

Go

go get cloud.google.com/go/texttospeech/apiv1

詳細については、Go 開発環境の設定をご覧ください。

Java

If you are using Maven, add the following to your pom.xml file. For more information about BOMs, see The Google Cloud Platform Libraries BOM.

<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>com.google.cloud</groupId>
      <artifactId>libraries-bom</artifactId>
      <version>26.67.0</version>
      <type>pom</type>
      <scope>import</scope>
    </dependency>
  </dependencies>
</dependencyManagement>

<dependencies>
  <dependency>
    <groupId>com.google.cloud</groupId>
    <artifactId>google-cloud-texttospeech</artifactId>
  </dependency>
</dependencies>

If you are using Gradle, add the following to your dependencies:

implementation 'com.google.cloud:google-cloud-texttospeech:2.74.0'

If you are using sbt, add the following to your dependencies:

libraryDependencies += "com.google.cloud" % "google-cloud-texttospeech" % "2.74.0"

If you're using Visual Studio Code, IntelliJ, or Eclipse, you can add client libraries to your project using the following IDE plugins:

The plugins provide additional functionality, such as key management for service accounts. Refer to each plugin's documentation for details.

詳細については、Java 開発環境の設定をご覧ください。

Node.js

npm install @google-cloud/text-to-speech

詳細については、Node.js 開発環境の設定をご覧ください。

PHP

composer require google/apiclient

詳細については、Google Cloud での PHP の使用をご覧ください。

Python

pip install --upgrade google-cloud-texttospeech

詳細については、Python 開発環境の設定をご覧ください。

Ruby

gem install google-api-client

詳細については、Ruby 開発環境の設定をご覧ください。

認証を設定する

Google Cloud API の呼び出しを認証するために、クライアントライブラリではアプリケーションのデフォルト認証情報（ADC）がサポートされています。このライブラリは、一連の定義済みロケーションの中から認証情報を探し、それらの認証情報を使用して API へのリクエストを認証します。ADC を使用すると、アプリケーションコードを変更することなく、ローカルでの開発や本番環境など、さまざまな環境のアプリケーションで認証情報を使用できるようになります。

本番環境では、ADC の設定方法はサービスとコンテキストによって異なります。詳細については、アプリケーションのデフォルト認証情報を設定するをご覧ください。

ローカル開発環境では、Google アカウントに関連付けられている認証情報を使用して ADC を設定できます。

Install the Google Cloud CLI, and then sign in to the gcloud CLI with your federated identity. After signing in, initialize the Google Cloud CLI by running the following command:
```
gcloud init
```
Create local authentication credentials for your user account:
```
gcloud auth application-default login
```
If an authentication error is returned, and you are using an external identity provider (IdP), confirm that you have signed in to the gcloud CLI with your federated identity.

ログイン画面が表示されます。ログインすると、 ADC で使用されるローカル認証情報ファイルに認証情報が保存されます。

クライアントライブラリの使用

次の例は、クライアントライブラリの使用方法を示しています。

C++


#include "google/cloud/texttospeech/v1/text_to_speech_client.h"
#include <iostream>

auto constexpr kText = R"""(
Four score and seven years ago our fathers brought forth on this
continent, a new nation, conceived in Liberty, and dedicated to
the proposition that all men are created equal.)""";

int main(int argc, char* argv[]) try {
  if (argc != 1) {
    std::cerr << "Usage: " << argv[0] << "\n";
    return 1;
  }

  namespace texttospeech = ::google::cloud::texttospeech_v1;
  auto client = texttospeech::TextToSpeechClient(
      texttospeech::MakeTextToSpeechConnection());

  google::cloud::texttospeech::v1::SynthesisInput input;
  input.set_text(kText);
  google::cloud::texttospeech::v1::VoiceSelectionParams voice;
  voice.set_language_code("en-US");
  google::cloud::texttospeech::v1::AudioConfig audio;
  audio.set_audio_encoding(google::cloud::texttospeech::v1::LINEAR16);

  auto response = client.SynthesizeSpeech(input, voice, audio);
  if (!response) throw std::move(response).status();
  // Normally one would play the results (response->audio_content()) over some
  // audio device. For this quickstart, we just print some information.
  auto constexpr kWavHeaderSize = 48;
  auto constexpr kBytesPerSample = 2;  // we asked for LINEAR16
  auto const sample_count =
      (response->audio_content().size() - kWavHeaderSize) / kBytesPerSample;
  std::cout << "The audio has " << sample_count << " samples\n";

  return 0;
} catch (google::cloud::Status const& status) {
  std::cerr << "google::cloud::Status thrown: " << status << "\n";
  return 1;
}

Go


// Command quickstart generates an audio file with the content "Hello, World!".
package main

import (
	"context"
	"fmt"
	"log"
	"os"

	texttospeech "cloud.google.com/go/texttospeech/apiv1"
	"cloud.google.com/go/texttospeech/apiv1/texttospeechpb"
)

func main() {
	// Instantiates a client.
	ctx := context.Background()

	client, err := texttospeech.NewClient(ctx)
	if err != nil {
		log.Fatal(err)
	}
	defer client.Close()

	// Perform the text-to-speech request on the text input with the selected
	// voice parameters and audio file type.
	req := texttospeechpb.SynthesizeSpeechRequest{
		// Set the text input to be synthesized.
		Input: &texttospeechpb.SynthesisInput{
			InputSource: &texttospeechpb.SynthesisInput_Text{Text: "Hello, World!"},
		},
		// Build the voice request, select the language code ("en-US") and the SSML
		// voice gender ("neutral").
		Voice: &texttospeechpb.VoiceSelectionParams{
			LanguageCode: "en-US",
			SsmlGender:   texttospeechpb.SsmlVoiceGender_NEUTRAL,
		},
		// Select the type of audio file you want returned.
		AudioConfig: &texttospeechpb.AudioConfig{
			AudioEncoding: texttospeechpb.AudioEncoding_MP3,
		},
	}

	resp, err := client.SynthesizeSpeech(ctx, &req)
	if err != nil {
		log.Fatal(err)
	}

	// The resp's AudioContent is binary.
	filename := "output.mp3"
	err = os.WriteFile(filename, resp.AudioContent, 0644)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("Audio content written to file: %v\n", filename)
}

Java

// Imports the Google Cloud client library
import com.google.cloud.texttospeech.v1.AudioConfig;
import com.google.cloud.texttospeech.v1.AudioEncoding;
import com.google.cloud.texttospeech.v1.SsmlVoiceGender;
import com.google.cloud.texttospeech.v1.SynthesisInput;
import com.google.cloud.texttospeech.v1.SynthesizeSpeechResponse;
import com.google.cloud.texttospeech.v1.TextToSpeechClient;
import com.google.cloud.texttospeech.v1.VoiceSelectionParams;
import com.google.protobuf.ByteString;
import java.io.FileOutputStream;
import java.io.OutputStream;

/**
 * Google Cloud TextToSpeech API sample application. Example usage: mvn package exec:java
 * -Dexec.mainClass='com.example.texttospeech.QuickstartSample'
 */
public class QuickstartSample {

  /** Demonstrates using the Text-to-Speech API. */
  public static void main(String... args) throws Exception {
    // Instantiates a client
    try (TextToSpeechClient textToSpeechClient = TextToSpeechClient.create()) {
      // Set the text input to be synthesized
      SynthesisInput input = SynthesisInput.newBuilder().setText("Hello, World!").build();

      // Build the voice request, select the language code ("en-US") and the ssml voice gender
      // ("neutral")
      VoiceSelectionParams voice =
          VoiceSelectionParams.newBuilder()
              .setLanguageCode("en-US")
              .setSsmlGender(SsmlVoiceGender.NEUTRAL)
              .build();

      // Select the type of audio file you want returned
      AudioConfig audioConfig =
          AudioConfig.newBuilder().setAudioEncoding(AudioEncoding.MP3).build();

      // Perform the text-to-speech request on the text input with the selected voice parameters and
      // audio file type
      SynthesizeSpeechResponse response =
          textToSpeechClient.synthesizeSpeech(input, voice, audioConfig);

      // Get the audio contents from the response
      ByteString audioContents = response.getAudioContent();

      // Write the response to the output file.
      try (OutputStream out = new FileOutputStream("output.mp3")) {
        out.write(audioContents.toByteArray());
        System.out.println("Audio content written to file \"output.mp3\"");
      }
    }
  }
}

Node.js

// Imports the Google Cloud client library
const textToSpeech = require('@google-cloud/text-to-speech');

// Import other required libraries
const {writeFile} = require('node:fs/promises');

// Creates a client
const client = new textToSpeech.TextToSpeechClient();

async function quickStart() {
  // The text to synthesize
  const text = 'hello, world!';

  // Construct the request
  const request = {
    input: {text: text},
    // Select the language and SSML voice gender (optional)
    voice: {languageCode: 'en-US', ssmlGender: 'NEUTRAL'},
    // select the type of audio encoding
    audioConfig: {audioEncoding: 'MP3'},
  };

  // Performs the text-to-speech request
  const [response] = await client.synthesizeSpeech(request);

  // Save the generated binary audio content to a local file
  await writeFile('output.mp3', response.audioContent, 'binary');
  console.log('Audio content written to file: output.mp3');
}

await quickStart();

Python

"""Synthesizes speech from the input string of text or ssml.
Make sure to be working in a virtual environment.

Note: ssml must be well-formed according to:
    https://www.w3.org/TR/speech-synthesis/
"""
from google.cloud import texttospeech

# Instantiates a client
client = texttospeech.TextToSpeechClient()

# Set the text input to be synthesized
synthesis_input = texttospeech.SynthesisInput(text="Hello, World!")

# Build the voice request, select the language code ("en-US") and the ssml
# voice gender ("neutral")
voice = texttospeech.VoiceSelectionParams(
    language_code="en-US", ssml_gender=texttospeech.SsmlVoiceGender.NEUTRAL
)

# Select the type of audio file you want returned
audio_config = texttospeech.AudioConfig(
    audio_encoding=texttospeech.AudioEncoding.MP3
)

# Perform the text-to-speech request on the text input with the selected
# voice parameters and audio file type
response = client.synthesize_speech(
    input=synthesis_input, voice=voice, audio_config=audio_config
)

# The response's audio_content is binary.
with open("output.mp3", "wb") as out:
    # Write the response to the output file.
    out.write(response.audio_content)
    print('Audio content written to file "output.mp3"')

参考情報

C++

次のリストは、C++ のクライアントライブラリに関連するその他のリソースへのリンクを示します。

C#

次のリストは、C# のクライアントライブラリに関連するその他のリソースへのリンクを示します。

Go

次のリストは、Go のクライアントライブラリに関連するその他のリソースへのリンクを示します。

Java

次のリストは、Java のクライアントライブラリに関連するその他のリソースへのリンクを示します。

Node.js

次のリストは、Node.js のクライアントライブラリに関連するその他のリソースへのリンクを示します。

PHP

次のリストは、PHP のクライアントライブラリに関連するその他のリソースへのリンクを示します。

Python

次のリストは、Python のクライアントライブラリに関連するその他のリソースへのリンクを示します。

Ruby

次のリストは、Ruby のクライアントライブラリに関連するその他のリソースへのリンクを示します。

Text-to-Speech クライアント ライブラリ コレクションでコンテンツを整理 必要に応じて、コンテンツの保存と分類を行います。

クライアント ライブラリをインストールする

C++

C#

Go

Java

Node.js

PHP

Python

Ruby

認証を設定する

クライアント ライブラリの使用

C++

Go

Java

Node.js

Python

参考情報

C++

C#

Go

Java

Node.js

PHP

Python

Ruby

Text-to-Speech クライアントライブラリ

クライアントライブラリをインストールする

クライアントライブラリの使用