Speech-to-Text 用戶端程式庫

本頁說明如何開始使用 Speech-to-Text API 適用的 Cloud 用戶端程式庫。用戶端程式庫可讓您從支援的語言輕鬆存取Google Cloud API。雖然您可以直接向伺服器發出原始要求來使用Google Cloud API,但用戶端程式庫提供簡化功能,可大幅減少您需要編寫的程式碼數量。

如要進一步瞭解 Cloud 用戶端程式庫和舊版 Google API 用戶端程式庫,請參閱用戶端程式庫說明

安裝用戶端程式庫

C#

如果您使用 Visual Studio 2017 以上版本,請開啟 NuGet 套件管理員視窗,然後輸入下列內容:

Install-Package Google.Apis

如果您使用 .NET Core 指令列介面工具安裝依附元件,請執行下列指令:

dotnet add package Google.Apis

詳情請參閱設定 C# 開發環境

Go

go get cloud.google.com/go/speech/apiv1

詳情請參閱「設定 Go 開發環境」。

Java

If you are using Maven, add the following to your pom.xml file. For more information about BOMs, see The Google Cloud Platform Libraries BOM.

<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>com.google.cloud</groupId>
      <artifactId>libraries-bom</artifactId>
      <version>26.61.0</version>
      <type>pom</type>
      <scope>import</scope>
    </dependency>
  </dependencies>
</dependencyManagement>

<dependencies>
  <dependency>
    <groupId>com.google.cloud</groupId>
    <artifactId>google-cloud-speech</artifactId>
  </dependency>
</dependencies>

If you are using Gradle, add the following to your dependencies:

implementation 'com.google.cloud:google-cloud-speech:4.61.0'

If you are using sbt, add the following to your dependencies:

libraryDependencies += "com.google.cloud" % "google-cloud-speech" % "4.61.0"

If you're using Visual Studio Code, IntelliJ, or Eclipse, you can add client libraries to your project using the following IDE plugins:

The plugins provide additional functionality, such as key management for service accounts. Refer to each plugin's documentation for details.

詳情請參閱「設定 Java 開發環境」。

Node.js

npm install @google-cloud/speech

詳情請參閱「設定 Node.js 開發環境」。

PHP

composer require google/apiclient

詳情請參閱「在 Google Cloud 上使用 PHP」。

Python

pip install --upgrade google-cloud-speech

詳情請參閱「設定 Python 開發環境」。

Ruby

gem install google-api-client

詳情請參閱「設定 Ruby 開發環境」。

設定驗證方法

為驗證對 Google Cloud API 的呼叫,用戶端程式庫支援應用程式預設憑證 (ADC);程式庫會在定義的一組位置中尋找憑證,並使用這些憑證驗證對 API 的要求。使用 ADC,您可以在各種環境 (例如本機開發或正式版) 中,為應用程式提供憑證,不必修改應用程式程式碼。

在實際工作環境中,設定 ADC 的方式取決於服務和環境。詳情請參閱「設定應用程式預設憑證」。

在本地開發環境中,您可以使用與 Google 帳戶相關聯的憑證設定 ADC:

  1. After installing the Google Cloud CLI, initialize it by running the following command:

    gcloud init

    If you're using an external identity provider (IdP), you must first sign in to the gcloud CLI with your federated identity.

  2. If you're using a local shell, then create local authentication credentials for your user account:

    gcloud auth application-default login

    You don't need to do this if you're using Cloud Shell.

    If an authentication error is returned, and you are using an external identity provider (IdP), confirm that you have signed in to the gcloud CLI with your federated identity.

    畫面上會顯示登入畫面。登入後,您的憑證會儲存在 ADC 使用的 本機憑證檔案中。

使用用戶端程式庫

以下範例將說明用戶端程式庫的使用方法。

Go


// Sample speech-quickstart uses the Google Cloud Speech API to transcribe
// audio.
package main

import (
	"context"
	"fmt"
	"log"

	speech "cloud.google.com/go/speech/apiv1"
	"cloud.google.com/go/speech/apiv1/speechpb"
)

func main() {
	ctx := context.Background()

	// Creates a client.
	client, err := speech.NewClient(ctx)
	if err != nil {
		log.Fatalf("Failed to create client: %v", err)
	}
	defer client.Close()

	// The path to the remote audio file to transcribe.
	fileURI := "gs://cloud-samples-data/speech/brooklyn_bridge.raw"

	// Detects speech in the audio file.
	resp, err := client.Recognize(ctx, &speechpb.RecognizeRequest{
		Config: &speechpb.RecognitionConfig{
			Encoding:        speechpb.RecognitionConfig_LINEAR16,
			SampleRateHertz: 16000,
			LanguageCode:    "en-US",
		},
		Audio: &speechpb.RecognitionAudio{
			AudioSource: &speechpb.RecognitionAudio_Uri{Uri: fileURI},
		},
	})
	if err != nil {
		log.Fatalf("failed to recognize: %v", err)
	}

	// Prints the results.
	for _, result := range resp.Results {
		for _, alt := range result.Alternatives {
			fmt.Printf("\"%v\" (confidence=%3f)\n", alt.Transcript, alt.Confidence)
		}
	}
}

Java

// Imports the Google Cloud client library
import com.google.cloud.speech.v1.RecognitionAudio;
import com.google.cloud.speech.v1.RecognitionConfig;
import com.google.cloud.speech.v1.RecognitionConfig.AudioEncoding;
import com.google.cloud.speech.v1.RecognizeResponse;
import com.google.cloud.speech.v1.SpeechClient;
import com.google.cloud.speech.v1.SpeechRecognitionAlternative;
import com.google.cloud.speech.v1.SpeechRecognitionResult;
import java.util.List;

public class QuickstartSample {

  /** Demonstrates using the Speech API to transcribe an audio file. */
  public static void main(String... args) throws Exception {
    // Instantiates a client
    try (SpeechClient speechClient = SpeechClient.create()) {

      // The path to the audio file to transcribe
      String gcsUri = "gs://cloud-samples-data/speech/brooklyn_bridge.raw";

      // Builds the sync recognize request
      RecognitionConfig config =
          RecognitionConfig.newBuilder()
              .setEncoding(AudioEncoding.LINEAR16)
              .setSampleRateHertz(16000)
              .setLanguageCode("en-US")
              .build();
      RecognitionAudio audio = RecognitionAudio.newBuilder().setUri(gcsUri).build();

      // Performs speech recognition on the audio file
      RecognizeResponse response = speechClient.recognize(config, audio);
      List<SpeechRecognitionResult> results = response.getResultsList();

      for (SpeechRecognitionResult result : results) {
        // There can be several alternative transcripts for a given chunk of speech. Just use the
        // first (most likely) one here.
        SpeechRecognitionAlternative alternative = result.getAlternativesList().get(0);
        System.out.printf("Transcription: %s%n", alternative.getTranscript());
      }
    }
  }
}

Node.js

// Imports the Google Cloud client library
const speech = require('@google-cloud/speech');

// Creates a client
const client = new speech.SpeechClient();

async function quickstart() {
  // The path to the remote LINEAR16 file
  const gcsUri = 'gs://cloud-samples-data/speech/brooklyn_bridge.raw';

  // The audio file's encoding, sample rate in hertz, and BCP-47 language code
  const audio = {
    uri: gcsUri,
  };
  const config = {
    encoding: 'LINEAR16',
    sampleRateHertz: 16000,
    languageCode: 'en-US',
  };
  const request = {
    audio: audio,
    config: config,
  };

  // Detects speech in the audio file
  const [response] = await client.recognize(request);
  const transcription = response.results
    .map(result => result.alternatives[0].transcript)
    .join('\n');
  console.log(`Transcription: ${transcription}`);
}
quickstart();

Python


# Imports the Google Cloud client library


from google.cloud import speech



def run_quickstart() -> speech.RecognizeResponse:
    # Instantiates a client
    client = speech.SpeechClient()

    # The name of the audio file to transcribe
    gcs_uri = "gs://cloud-samples-data/speech/brooklyn_bridge.raw"

    audio = speech.RecognitionAudio(uri=gcs_uri)

    config = speech.RecognitionConfig(
        encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
        sample_rate_hertz=16000,
        language_code="en-US",
    )

    # Detects speech in the audio file
    response = client.recognize(config=config, audio=audio)

    for result in response.results:
        print(f"Transcript: {result.alternatives[0].transcript}")

其他資源

C#

以下清單列出與 C# 用戶端程式庫相關的更多資源連結:

Go

下列清單包含與 Go 專用用戶端程式庫相關的更多資源連結:

Java

以下列出與 Java 用戶端程式庫相關的更多資源連結:

Node.js

以下清單列出與 Node.js 用戶端程式庫相關的更多資源連結:

PHP

下列清單包含與 PHP 用戶端程式庫相關的更多資源連結:

Python

以下清單包含適用於 Python 的用戶端程式庫相關資源連結:

Ruby

以下清單提供與 Ruby 用戶端程式庫相關的更多資源連結: