BigQuery API client libraries

This page shows how to get started with the Cloud Client Libraries for the BigQuery API. Client libraries make it easier to access Google Cloud APIs from a supported language. Although you can use Google Cloud APIs directly by making raw requests to the server, client libraries provide simplifications that significantly reduce the amount of code you need to write.

For more information about the Cloud Client Libraries and the older Google API Client Libraries, see Client Libraries Explained.

Install the client library

C#

Install-Package Google.Cloud.BigQuery.V2 -Pre

For more information, see Setting Up a C# Development Environment.

Go

go get cloud.google.com/go/bigquery

For more information, see Setting Up a Go Development Environment.

Java

If you are using Maven, add the following to your pom.xml file. For more information about BOMs, see The Google Cloud Platform Libraries BOM.

<!--  Using libraries-bom to manage versions.
See https://github.com/GoogleCloudPlatform/cloud-opensource-java/wiki/The-Google-Cloud-Platform-Libraries-BOM -->
<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>com.google.cloud</groupId>
      <artifactId>libraries-bom</artifactId>
      <version>26.62.0</version>
      <type>pom</type>
      <scope>import</scope>
    </dependency>
  </dependencies>
</dependencyManagement>

<dependencies>
  <dependency>
    <groupId>com.google.cloud</groupId>
    <artifactId>google-cloud-bigquery</artifactId>
  </dependency>
</dependencies>

If you are using Gradle, add the following to your dependencies:

implementation platform('com.google.cloud:libraries-bom:26.62.0')

implementation 'com.google.cloud:google-cloud-bigquery'

If you are using sbt, add the following to your dependencies:

libraryDependencies += "com.google.cloud" % "google-cloud-bigquery" % "2.42.2"

If you're using Visual Studio Code, IntelliJ, or Eclipse, you can add client libraries to your project using your IDE's Google Cloud plugins.

The plugins provide additional functionality, such as key management for service accounts. Refer to each plugin's documentation for details.

For more information, see Setting Up a Java Development Environment.

Node.js

npm install @google-cloud/bigquery

For more information, see Setting Up a Node.js Development Environment.

PHP

composer require google/cloud-bigquery

For more information, see Using PHP on Google Cloud.

Python

pip install --upgrade google-cloud-bigquery

For more information, see Setting Up a Python Development Environment.

Ruby

gem install google-cloud-bigquery

For more information, see Setting Up a Ruby Development Environment.

Set up authentication

To authenticate calls to Google Cloud APIs, client libraries support Application Default Credentials (ADC); the libraries look for credentials in a set of defined locations and use those credentials to authenticate requests to the API. With ADC, you can make credentials available to your application in a variety of environments, such as local development or production, without needing to modify your application code.

For production environments, how you set up ADC depends on the service and context. For more information, see Set up Application Default Credentials.

For a local development environment, you can set up ADC with the credentials that are associated with your Google Account (a short verification sketch follows these steps):

  1. After installing the Google Cloud CLI, initialize it by running the following command:

    gcloud init

    If you're using an external identity provider (IdP), you must first sign in to the gcloud CLI with your federated identity.

  2. If you're using a local shell, then create local authentication credentials for your user account:

    gcloud auth application-default login

    You don't need to do this if you're using Cloud Shell.

    If an authentication error is returned, and you are using an external identity provider (IdP), confirm that you have signed in to the gcloud CLI with your federated identity.

    A sign-in screen appears. After you sign in, your credentials are stored in the local credential file used by ADC.
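
The following is a minimal sketch (assuming Python and the google-auth package, which the Cloud Client Libraries depend on) that you can run to confirm that ADC credentials are found after the steps above; the cloud-platform scope shown here is only an illustrative assumption:

import google.auth

# google.auth.default() looks in the standard ADC locations (the
# GOOGLE_APPLICATION_CREDENTIALS environment variable, the gcloud ADC
# file created by the command above, or the metadata server when
# running on Google Cloud) and returns the credentials and, if
# available, the associated project ID.
credentials, project_id = google.auth.default(
    scopes=["https://www.googleapis.com/auth/cloud-platform"]
)
print("Found ADC credentials; project:", project_id)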

Use the client library

The following example shows how to initialize a client and run a query against a BigQuery public dataset.

C#


using Google.Cloud.BigQuery.V2;
using System;

public class BigQueryQuery
{
    public void Query(
        string projectId = "your-project-id"
    )
    {
        BigQueryClient client = BigQueryClient.Create(projectId);
        string query = @"
            SELECT name FROM `bigquery-public-data.usa_names.usa_1910_2013`
            WHERE state = 'TX'
            LIMIT 100";
        BigQueryJob job = client.CreateQueryJob(
            sql: query,
            parameters: null,
            options: new QueryOptions { UseQueryCache = false });
        // Wait for the job to complete.
        job = job.PollUntilCompleted().ThrowOnAnyError();
        // Display the results
        foreach (BigQueryRow row in client.GetQueryResults(job.Reference))
        {
            Console.WriteLine($"{row["name"]}");
        }
    }
}

Go

import (
	"context"
	"fmt"
	"io"

	"cloud.google.com/go/bigquery"
	"google.golang.org/api/iterator"
)

// queryBasic demonstrates issuing a query and reading results.
func queryBasic(w io.Writer, projectID string) error {
	// projectID := "my-project-id"
	ctx := context.Background()
	client, err := bigquery.NewClient(ctx, projectID)
	if err != nil {
		return fmt.Errorf("bigquery.NewClient: %v", err)
	}
	defer client.Close()

	q := client.Query(
		"SELECT name FROM `bigquery-public-data.usa_names.usa_1910_2013` " +
			"WHERE state = \"TX\" " +
			"LIMIT 100")
	// Location must match that of the dataset(s) referenced in the query.
	q.Location = "US"
	// Run the query and print results when the query job is completed.
	job, err := q.Run(ctx)
	if err != nil {
		return err
	}
	status, err := job.Wait(ctx)
	if err != nil {
		return err
	}
	if err := status.Err(); err != nil {
		return err
	}
	it, err := job.Read(ctx)
	if err != nil {
		return err
	}
	for {
		var row []bigquery.Value
		err := it.Next(&row)
		if err == iterator.Done {
			break
		}
		if err != nil {
			return err
		}
		fmt.Fprintln(w, row)
	}
	return nil
}

Java


import com.google.cloud.bigquery.BigQuery;
import com.google.cloud.bigquery.BigQueryException;
import com.google.cloud.bigquery.BigQueryOptions;
import com.google.cloud.bigquery.FieldValueList;
import com.google.cloud.bigquery.Job;
import com.google.cloud.bigquery.JobId;
import com.google.cloud.bigquery.JobInfo;
import com.google.cloud.bigquery.QueryJobConfiguration;
import com.google.cloud.bigquery.TableResult;


public class SimpleApp {

  public static void main(String... args) throws Exception {
    // TODO(developer): Replace these variables before running the app.
    String projectId = "MY_PROJECT_ID";
    simpleApp(projectId);
  }

  public static void simpleApp(String projectId) {
    try {
      BigQuery bigquery = BigQueryOptions.getDefaultInstance().getService();
      QueryJobConfiguration queryConfig =
          QueryJobConfiguration.newBuilder(
                  "SELECT CONCAT('https://stackoverflow.com/questions/', "
                      + "CAST(id as STRING)) as url, view_count "
                      + "FROM `bigquery-public-data.stackoverflow.posts_questions` "
                      + "WHERE tags like '%google-bigquery%' "
                      + "ORDER BY view_count DESC "
                      + "LIMIT 10")
              // Use standard SQL syntax for queries.
              // See: https://cloud.google.com/bigquery/sql-reference/
              .setUseLegacySql(false)
              .build();

      JobId jobId = JobId.newBuilder().setProject(projectId).build();
      Job queryJob = bigquery.create(JobInfo.newBuilder(queryConfig).setJobId(jobId).build());

      // Wait for the query to complete.
      queryJob = queryJob.waitFor();

      // Check for errors
      if (queryJob == null) {
        throw new RuntimeException("Job no longer exists");
      } else if (queryJob.getStatus().getExecutionErrors() != null
          && queryJob.getStatus().getExecutionErrors().size() > 0) {
        // TODO(developer): Handle errors here. An error here does not necessarily mean that the
        // job has completed or was unsuccessful.
        // For more details: https://cloud.google.com/bigquery/troubleshooting-errors
        throw new RuntimeException("An unhandled error has occurred");
      }

      // Get the results.
      TableResult result = queryJob.getQueryResults();

      // Print all pages of the results.
      for (FieldValueList row : result.iterateAll()) {
        // String type
        String url = row.get("url").getStringValue();
        String viewCount = row.get("view_count").getStringValue();
        System.out.printf("%s : %s views\n", url, viewCount);
      }
    } catch (BigQueryException | InterruptedException e) {
      System.out.println("Simple App failed due to error: \n" + e.toString());
    }
  }
}

Node.js

// Import the Google Cloud client library using default credentials
const {BigQuery} = require('@google-cloud/bigquery');
const bigquery = new BigQuery();
async function query() {
  // Queries the U.S. given names dataset for the state of Texas.

  const query = `SELECT name
    FROM \`bigquery-public-data.usa_names.usa_1910_2013\`
    WHERE state = 'TX'
    LIMIT 100`;

  // For all options, see https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs/query
  const options = {
    query: query,
    // Location must match that of the dataset(s) referenced in the query.
    location: 'US',
  };

  // Run the query as a job
  const [job] = await bigquery.createQueryJob(options);
  console.log(`Job ${job.id} started.`);

  // Wait for the query to finish
  const [rows] = await job.getQueryResults();

  // Print the results
  console.log('Rows:');
  rows.forEach(row => console.log(row));
}
// Run the sample.
query();

PHP

use Google\Cloud\BigQuery\BigQueryClient;
use Google\Cloud\Core\ExponentialBackoff;

/** Uncomment and populate these variables in your code */
// $projectId = 'The Google project ID';
// $query = 'SELECT id, view_count FROM `bigquery-public-data.stackoverflow.posts_questions`';

$bigQuery = new BigQueryClient([
    'projectId' => $projectId,
]);
$jobConfig = $bigQuery->query($query);
$job = $bigQuery->startQuery($jobConfig);

$backoff = new ExponentialBackoff(10);
$backoff->execute(function () use ($job) {
    print('Waiting for job to complete' . PHP_EOL);
    $job->reload();
    if (!$job->isComplete()) {
        throw new Exception('Job has not yet completed', 500);
    }
});
$queryResults = $job->queryResults();

$i = 0;
foreach ($queryResults as $row) {
    printf('--- Row %s ---' . PHP_EOL, ++$i);
    foreach ($row as $column => $value) {
        printf('%s: %s' . PHP_EOL, $column, json_encode($value));
    }
}
printf('Found %s row(s)' . PHP_EOL, $i);

Python

from google.cloud import bigquery

# Construct a BigQuery client object.
client = bigquery.Client()

query = """
    SELECT name, SUM(number) as total_people
    FROM `bigquery-public-data.usa_names.usa_1910_2013`
    WHERE state = 'TX'
    GROUP BY name, state
    ORDER BY total_people DESC
    LIMIT 20
"""
rows = client.query_and_wait(query)  # Make an API request.

print("The query data:")
for row in rows:
    # Row values can be accessed by field name or index.
    print("name={}, count={}".format(row[0], row["total_people"]))

Ruby

require "google/cloud/bigquery"

def query
  bigquery = Google::Cloud::Bigquery.new
  sql = "SELECT name FROM `bigquery-public-data.usa_names.usa_1910_2013` " \
        "WHERE state = 'TX' " \
        "LIMIT 100"

  # Location must match that of the dataset(s) referenced in the query.
  results = bigquery.query sql do |config|
    config.location = "US"
  end

  results.each do |row|
    puts row.inspect
  end
end

Additional resources

C#

The following list contains links to more resources related to the client library for C#:

Go

The following list contains links to more resources related to the client library for Go:

Java

The following list contains links to more resources related to the client library for Java:

Node.js

The following list contains links to more resources related to the client library for Node.js:

PHP

The following list contains links to more resources related to the client library for PHP:

Python

The following list contains links to more resources related to the client library for Python:

Ruby

The following list contains links to more resources related to the client library for Ruby:

Third-party BigQuery API client libraries

In addition to the Google-supported client libraries listed above, a set of third-party libraries is also available; a short usage sketch follows the table below.

Language	Libraries
Python	pandas-gbq (usage guide), ibis (tutorial)
R	bigrquery, BigQueryR
Scala	spark-bigquery-connector
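
As a brief illustration of one of these third-party options, here is a minimal sketch using pandas-gbq to run a public-dataset query; it assumes the package is installed (pip install pandas-gbq), that ADC is set up as described above, and that "your-project-id" is a placeholder you replace with your own project ID:

import pandas_gbq

sql = """
    SELECT name, SUM(number) AS total_people
    FROM `bigquery-public-data.usa_names.usa_1910_2013`
    WHERE state = 'TX'
    GROUP BY name
    ORDER BY total_people DESC
    LIMIT 10
"""

# read_gbq runs the query in BigQuery and returns the result as a
# pandas DataFrame, billing the query job to the given project.
df = pandas_gbq.read_gbq(sql, project_id="your-project-id")
print(df)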

What's next

Try it for yourself

If you're new to Google Cloud, create an account to evaluate how BigQuery performs in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.

Try BigQuery free