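Speech-to-Text client libraries

This page shows how to get started with the Cloud Client Libraries for the Speech-to-Text API. Client libraries make it easier to access Google Cloud APIs from a supported language. Although you can use Google Cloud APIs directly by making raw requests to the server, client libraries provide simplifications that significantly reduce the amount of code you need to write.

To learn more about the Cloud Client Libraries and the older Google API Client Libraries, see Client libraries explained.

Install the client library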

C#

Install-Package Google.Cloud.Speech.V2

For more information, see Setting Up a C# Development Environment.

Go

go get cloud.google.com/go/speech/apiv2

For more information, see Setting Up a Go Development Environment.

Java

If you are using Maven, add the following to your pom.xml file. For more information about BOMs, see The Google Cloud Platform Libraries BOM.

<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>com.google.cloud</groupId>
      <artifactId>libraries-bom</artifactId>
      <version>26.61.0</version>
      <type>pom</type>
      <scope>import</scope>
    </dependency>
  </dependencies>
</dependencyManagement>

<dependencies>
  <dependency>
    <groupId>com.google.cloud</groupId>
    <artifactId>google-cloud-speech</artifactId>
  </dependency>
</dependencies>

If you are using Gradle, add the following to your dependencies:

implementation 'com.google.cloud:google-cloud-speech:4.61.0'

If you are using sbt, add the following to your dependencies:

libraryDependencies += "com.google.cloud" % "google-cloud-speech" % "4.61.0"

If you're using Visual Studio Code, IntelliJ, or Eclipse, you can add client libraries to your project using IDE plugins. The plugins provide additional functionality, such as key management for service accounts. Refer to each plugin's documentation for details.

For more information, see Setting Up a Java Development Environment.

Node.js

npm install @google-cloud/speech

For more information, see Setting Up a Node.js Development Environment.

PHP

composer require google/cloud-speech

For more information, see Using PHP on Google Cloud.

Python

pip install --upgrade google-cloud-speech

For more information, see Setting Up a Python Development Environment.
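After running the pip command, it can be useful to confirm which version of the library ended up in the active environment. The following is a minimal sketch that uses only the Python standard library; the helper name installed_version is hypothetical and is not part of the client library.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(distribution: str = "google-cloud-speech") -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent.

    The default value is the distribution name used in the pip command above.
    """
    try:
        return version(distribution)
    except PackageNotFoundError:
        # The package is not installed in the active environment.
        return None


if __name__ == "__main__":
    found = installed_version()
    if found is None:
        print("google-cloud-speech is not installed in this environment")
    else:
        print(f"google-cloud-speech {found} is installed")
```

A result of None usually means that pip installed the package into a different interpreter or virtual environment than the one running the script.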

Ruby

gem install google-cloud-speech

For more information, see Setting Up a Ruby Development Environment.

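Set up authentication

To authenticate calls to Google Cloud APIs, client libraries support Application Default Credentials (ADC). The libraries look for credentials in a set of defined locations and use those credentials to authenticate requests to the API. With ADC, you can make credentials available to your application in a variety of environments, such as local development or production, without modifying your application code.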
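To make the idea of "a set of defined locations" concrete, the sketch below walks through a simplified version of the ADC search order: the GOOGLE_APPLICATION_CREDENTIALS environment variable first, then the local file written by the gcloud CLI, then the service account attached to the resource the code runs on. It is an illustration only, not the lookup code used by the client libraries; the function names are hypothetical, and the sketch ignores details such as a gcloud configuration directory overridden through CLOUDSDK_CONFIG.

```python
import os
from pathlib import Path
from typing import Mapping


def well_known_adc_file(env: Mapping[str, str], home: Path, windows: bool = False) -> Path:
    """Return the default location of the local ADC file written by the gcloud CLI."""
    if windows:
        # On Windows the gcloud configuration directory lives under %APPDATA%.
        return Path(env.get("APPDATA", "")) / "gcloud" / "application_default_credentials.json"
    return home / ".config" / "gcloud" / "application_default_credentials.json"


def describe_adc_source(env: Mapping[str, str], home: Path, windows: bool = False) -> str:
    """Describe which credential source a simplified ADC lookup would pick."""
    # 1. An explicitly configured credential file takes precedence.
    explicit = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if explicit:
        return f"credential file from GOOGLE_APPLICATION_CREDENTIALS: {explicit}"

    # 2. Credentials created by `gcloud auth application-default login`.
    local = well_known_adc_file(env, home, windows)
    if local.is_file():
        return f"local ADC file: {local}"

    # 3. Otherwise ADC falls back to the attached service account, which is
    #    only available when the code runs on Google Cloud.
    return "attached service account (metadata server), if running on Google Cloud"


if __name__ == "__main__":
    print(describe_adc_source(os.environ, Path.home(), windows=(os.name == "nt")))
```

Passing the environment and home directory as arguments, rather than reading them inside the function, keeps the sketch easy to try against a temporary directory.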

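For production environments, the way you set up ADC depends on the service and context. For more information, see Set up Application Default Credentials.

For a local development environment, you can set up ADC with the credentials that are associated with your Google Account: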

  1. After installing the Google Cloud CLI, initialize it by running the following command:

    gcloud init

    If you're using an external identity provider (IdP), you must first sign in to the gcloud CLI with your federated identity.

  2. If you're using a local shell, then create local authentication credentials for your user account:

    gcloud auth application-default login

    You don't need to do this if you're using Cloud Shell.

    If an authentication error is returned, and you are using an external identity provider (IdP), confirm that you have signed in to the gcloud CLI with your federated identity.

    A sign-in screen appears. After you sign in, your credentials are stored in the local credential file used by ADC.

Use the client library

The following examples show how to use the client library.

Java

// Imports the Google Cloud client library
import com.google.api.gax.longrunning.OperationFuture;
import com.google.cloud.speech.v2.AutoDetectDecodingConfig;
import com.google.cloud.speech.v2.CreateRecognizerRequest;
import com.google.cloud.speech.v2.OperationMetadata;
import com.google.cloud.speech.v2.RecognitionConfig;
import com.google.cloud.speech.v2.RecognizeRequest;
import com.google.cloud.speech.v2.RecognizeResponse;
import com.google.cloud.speech.v2.Recognizer;
import com.google.cloud.speech.v2.SpeechClient;
import com.google.cloud.speech.v2.SpeechRecognitionAlternative;
import com.google.cloud.speech.v2.SpeechRecognitionResult;
import com.google.protobuf.ByteString;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.concurrent.ExecutionException;

public class QuickstartSampleV2 {

  public static void main(String[] args) throws IOException, ExecutionException,
      InterruptedException {
    String projectId = "my-project-id";
    String filePath = "path/to/audioFile.raw";
    String recognizerId = "my-recognizer-id";
    quickstartSampleV2(projectId, filePath, recognizerId);
  }

  public static void quickstartSampleV2(String projectId, String filePath, String recognizerId)
      throws IOException, ExecutionException, InterruptedException {

    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests. After completing all of your requests, call
    // the "close" method on the client to safely clean up any remaining background resources.
    try (SpeechClient speechClient = SpeechClient.create()) {
      Path path = Paths.get(filePath);
      byte[] data = Files.readAllBytes(path);
      ByteString audioBytes = ByteString.copyFrom(data);

      String parent = String.format("projects/%s/locations/global", projectId);

      // First, create a recognizer
      Recognizer recognizer = Recognizer.newBuilder()
          .setModel("latest_long")
          .addLanguageCodes("en-US")
          .build();

      CreateRecognizerRequest createRecognizerRequest = CreateRecognizerRequest.newBuilder()
          .setParent(parent)
          .setRecognizerId(recognizerId)
          .setRecognizer(recognizer)
          .build();

      OperationFuture<Recognizer, OperationMetadata> operationFuture =
          speechClient.createRecognizerAsync(createRecognizerRequest);
      recognizer = operationFuture.get();

      // Next, create the transcription request
      RecognitionConfig recognitionConfig = RecognitionConfig.newBuilder()
          .setAutoDecodingConfig(AutoDetectDecodingConfig.newBuilder().build())
          .build();

      RecognizeRequest request = RecognizeRequest.newBuilder()
          .setConfig(recognitionConfig)
          .setRecognizer(recognizer.getName())
          .setContent(audioBytes)
          .build();

      RecognizeResponse response = speechClient.recognize(request);
      List<SpeechRecognitionResult> results = response.getResultsList();

      for (SpeechRecognitionResult result : results) {
        // There can be several alternative transcripts for a given chunk of speech. Just use the
        // first (most likely) one here.
        if (result.getAlternativesCount() > 0) {
          SpeechRecognitionAlternative alternative = result.getAlternativesList().get(0);
          System.out.printf("Transcription: %s%n", alternative.getTranscript());
        }
      }
    }
  }
}

Python

import os

from google.cloud.speech_v2 import SpeechClient
from google.cloud.speech_v2.types import cloud_speech

PROJECT_ID = os.getenv("GOOGLE_CLOUD_PROJECT")


def quickstart_v2(audio_file: str) -> cloud_speech.RecognizeResponse:
    """Transcribe an audio file.
    Args:
        audio_file (str): Path to the local audio file to be transcribed.
    Returns:
        cloud_speech.RecognizeResponse: The response from the recognize request, containing
        the transcription results
    """
    # Reads a file as bytes
    with open(audio_file, "rb") as f:
        audio_content = f.read()

    # Instantiates a client
    client = SpeechClient()

    config = cloud_speech.RecognitionConfig(
        auto_decoding_config=cloud_speech.AutoDetectDecodingConfig(),
        language_codes=["en-US"],
        model="long",
    )

    request = cloud_speech.RecognizeRequest(
        recognizer=f"projects/{PROJECT_ID}/locations/global/recognizers/_",
        config=config,
        content=audio_content,
    )

    # Transcribes the audio into text
    response = client.recognize(request=request)

    for result in response.results:
        print(f"Transcript: {result.alternatives[0].transcript}")

    return response
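The Python sample above has two small rough edges: if GOOGLE_CLOUD_PROJECT is unset, the recognizer path silently becomes projects/None/..., and result.alternatives[0] raises an IndexError for a result that has no alternatives (the Java sample guards against this case). The following sketch shows one way to handle both. The helper names are hypothetical, and SimpleNamespace objects stand in for the real response types so that the example runs without calling the API.

```python
from types import SimpleNamespace
from typing import Iterable, List, Optional


def default_recognizer_name(project_id: Optional[str], location: str = "global") -> str:
    """Build the resource name of the implicit recognizer ("_") for a project."""
    if not project_id:
        raise ValueError(
            "project_id is empty; set GOOGLE_CLOUD_PROJECT or pass the ID explicitly"
        )
    return f"projects/{project_id}/locations/{location}/recognizers/_"


def best_transcripts(results: Iterable) -> List[str]:
    """Collect the first (most likely) transcript of each result that has one."""
    transcripts = []
    for result in results:
        # Skip results without alternatives instead of indexing blindly.
        if len(result.alternatives) > 0:
            transcripts.append(result.alternatives[0].transcript)
    return transcripts


if __name__ == "__main__":
    # Stand-in objects shaped like the results of a RecognizeResponse.
    fake_results = [
        SimpleNamespace(alternatives=[SimpleNamespace(transcript="hello world")]),
        SimpleNamespace(alternatives=[]),
    ]
    print(default_recognizer_name("my-project-id"))
    print(best_transcripts(fake_results))
```

In the sample above, default_recognizer_name(PROJECT_ID) could replace the f-string passed as recognizer, and best_transcripts(response.results) could replace the printing loop.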

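Additional resources

C#

The following list contains links to more resources related to the client library for C#:

Go

The following list contains links to more resources related to the client library for Go:

Java

The following list contains links to more resources related to the client library for Java:

Node.js

The following list contains links to more resources related to the client library for Node.js:

PHP

The following list contains links to more resources related to the client library for PHP:

Python

The following list contains links to more resources related to the client library for Python:

Ruby

The following list contains links to more resources related to the client library for Ruby: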