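Export table data to Cloud Storage

This page describes how to export or extract data from BigQuery tables to Cloud Storage.

After you've loaded your data into BigQuery, you can export the data in several formats. BigQuery can export up to 1 GB of data to a single file. If you are exporting more than 1 GB of data, you must export it to multiple files. When you export data to multiple files, the size of the files varies.

You can also export the results of a query by using the EXPORT DATA statement. You can use EXPORT DATA OPTIONS to specify the format of the exported data.

Finally, you can use a service such as Dataflow to read data from BigQuery instead of exporting it from BigLake. For more information about using Dataflow to read from and write to BigQuery, see the BigQuery I/O documentation.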

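Export limitations

When you export data from BigQuery, note the following:

You cannot export table data to a local file, to Google Sheets, or to Google Drive. The only supported export location is Cloud Storage. For information on saving query results, see Downloading and saving query results.
You can export up to 1 GB of table data to a single file. If you are exporting more than 1 GB of data, use a wildcard to export the data into multiple files. When you export data to multiple files, the size of the files varies. To limit the exported file size, you can partition your data and export each partition.
The generated file size is not guaranteed when you use the EXPORT DATA statement.
The number of files generated by an export job can vary.
You cannot export nested and repeated data in CSV format. Nested and repeated data are supported for Avro, JSON, and Parquet exports.
When you export data in JSON format, INT64 (integer) data types are encoded as JSON strings to preserve 64-bit precision when the data is read by other systems.
You cannot export data from multiple tables in a single export job.
You cannot choose a compression type other than GZIP when you export data using the Google Cloud console.
When you export a table in JSON format, the symbols <, >, and & are converted by using the Unicode notation \uNNNN, where N is a hexadecimal digit. For example, profit&loss becomes profit\u0026loss. This Unicode conversion is done to avoid security vulnerabilities.
The order of exported table data is not guaranteed unless you use the EXPORT DATA statement and specify an ORDER BY clause in the query_statement.
BigQuery doesn't support Cloud Storage resource paths that include multiple consecutive slashes after the initial double slash. Cloud Storage object names can contain multiple consecutive slash ("/") characters, but BigQuery converts multiple consecutive slashes into a single slash. For example, the following resource path is valid in Cloud Storage but doesn't work in BigQuery: gs://bucket/my//object//name.
Any new data loaded into BigQuery while an export job is running is not included in that export job. You must create a new export job to export the new data.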
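
The two JSON-related limitations above affect whoever reads the exported files. The following is a minimal sketch, not part of the BigQuery tooling, of how a consumer might parse one line of a newline-delimited JSON export. The function name parse_exported_row and the sample line are hypothetical; the sketch only assumes that INT64 values arrive as JSON strings and that <, >, and & arrive as \uNNNN escapes, as described above.

```python
import json
from typing import Iterable


def parse_exported_row(line: str, int64_fields: Iterable[str]) -> dict:
    """Parse one line of a newline-delimited JSON export (illustrative helper)."""
    # The JSON parser decodes \uNNNN escapes, so "\u0026" becomes "&" again.
    row = json.loads(line)
    for name in int64_fields:
        # INT64 values are exported as strings; convert them back to integers.
        if row.get(name) is not None:
            row[name] = int(row[name])
    return row


# Hypothetical exported line: the INT64 is quoted and "&" is escaped.
line = '{"id": "9007199254740993", "label": "profit\\u0026loss"}'
row = parse_exported_row(line, ["id"])
print(row)
```

Python integers are arbitrary precision, so the value survives intact; in a language whose JSON numbers are 64-bit floats, keeping the value as a string (or using a big-integer type) is what preserves the precision.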

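Before you begin

Grant Identity and Access Management (IAM) roles that give users the necessary permissions to perform each task in this document.

Required permissions

To perform the tasks in this document, you need the following permissions.

Permissions to export data from a BigQuery table

To export data from a BigQuery table, you need the bigquery.tables.export IAM permission.

Each of the following predefined IAM roles includes the bigquery.tables.export permission:

roles/bigquery.dataViewer
roles/bigquery.dataOwner
roles/bigquery.dataEditor
roles/bigquery.admin

Permissions to run an export job

To run an export job, you need the bigquery.jobs.create IAM permission.

Each of the following predefined IAM roles includes the permissions that you need to run an export job:

roles/bigquery.user
roles/bigquery.jobUser
roles/bigquery.admin

Permissions to write the data to the Cloud Storage bucket

To write the data to an existing Cloud Storage bucket, you need the following IAM permissions:

storage.objects.create
storage.objects.delete

Each of the following predefined IAM roles includes the permissions that you need to write the data to an existing Cloud Storage bucket:

roles/storage.objectAdmin
roles/storage.admin

For more information about IAM roles and permissions in BigQuery, see Predefined roles and permissions.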

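Export formats and compression types

BigQuery supports the following data formats and compression types for exported data:

Data format	Supported compression types	Details
CSV	GZIP	You can control the CSV delimiter in your exported data by using the `--field_delimiter` bq command-line tool flag or the `configuration.extract.fieldDelimiter` extract job property. Nested and repeated data is not supported.
JSON	GZIP	Nested and repeated data are supported.
Avro	DEFLATE, SNAPPY	GZIP is not supported for Avro exports. Nested and repeated data are supported. See Avro export details.
Parquet	SNAPPY, GZIP, ZSTD	Nested and repeated data are supported. See Parquet export details.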
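
The format and compression combinations in the table can be checked before a job is submitted. The following is a small sketch under the assumption that the table above is complete; the names SUPPORTED_COMPRESSION and check_compression are hypothetical and are not part of any BigQuery client library.

```python
# Supported compression types per export format, copied from the table above.
SUPPORTED_COMPRESSION = {
    "CSV": {"GZIP"},
    "NEWLINE_DELIMITED_JSON": {"GZIP"},
    "AVRO": {"DEFLATE", "SNAPPY"},
    "PARQUET": {"SNAPPY", "GZIP", "ZSTD"},
}


def check_compression(destination_format, compression=None):
    """Raise ValueError if the format/compression pair is not in the table."""
    fmt = destination_format.upper()
    if fmt not in SUPPORTED_COMPRESSION:
        raise ValueError(f"Unknown export format: {destination_format}")
    if compression is None or compression.upper() == "NONE":
        return  # Exporting without compression is always allowed.
    if compression.upper() not in SUPPORTED_COMPRESSION[fmt]:
        allowed = ", ".join(sorted(SUPPORTED_COMPRESSION[fmt]))
        raise ValueError(f"{fmt} supports only: {allowed}")


check_compression("AVRO", "SNAPPY")  # Accepted.
try:
    check_compression("AVRO", "GZIP")  # Rejected: GZIP is not available for Avro.
except ValueError as err:
    print(err)
```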

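Export data

You can export table data in the following ways:

Using the Google Cloud console
Using the bq extract command in the bq command-line tool
Submitting an extract job by using the API or client libraries

Export table data

To export data from a BigQuery table:

Console

Open the BigQuery page in the Google Cloud console.

In the "Explorer" panel, expand your project and dataset, then select the table.
In the details panel, click "Export" and select "Export to Cloud Storage".
In the "Export table to Google Cloud Storage" dialog:
- For "Select Google Cloud Storage location", browse for the bucket, folder, or file where you want to export the data.
- For "Export format", choose the format for your exported data: CSV, JSON (Newline Delimited), Avro, or Parquet.
- For "Compression", select a compression format or select None for no compression.
- Click "Save" to export the table.

To check on the progress of the job, expand the "Job history" pane and look for a job of type EXTRACT.

To export views to Cloud Storage, use the EXPORT DATA OPTIONS statement.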

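SQL

Use the EXPORT DATA statement. The following example exports selected fields from a table named mydataset.table1:

In the Google Cloud console, go to the BigQuery page.

In the query editor, enter the following statement: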

EXPORT DATA
  OPTIONS (
    uri = 'gs://bucket/folder/*.csv',
    format = 'CSV',
    overwrite = true,
    header = true,
    field_delimiter = ';')
AS (
  SELECT field1, field2
  FROM mydataset.table1
  ORDER BY field1
);

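Click "Run".

For more information about how to run queries, see Run an interactive query.

bq

Use the bq extract command with the --destination_format flag.

(Optional) Supply the --location flag and set the value to your location.

Other optional flags include:

--compression: The compression type to use for exported files.
--field_delimiter: The character that indicates the boundary between columns in the output file for CSV exports. Both \t and tab are accepted for tab delimiters.
--print_header: When specified, header rows are printed for formats that have headers, such as CSV.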

bq extract --location=location \
--destination_format format \
--compression compression_type \
--field_delimiter delimiter \
--print_header=boolean \
project_id:dataset.table \
gs://bucket/filename.ext

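Where:

location is the name of your location. The --location flag is optional. For example, if you are using BigQuery in the Tokyo region, you can set the flag's value to asia-northeast1. You can set a default value for the location by using the .bigqueryrc file.
format is the format for the exported data: CSV, NEWLINE_DELIMITED_JSON, AVRO, or PARQUET.
compression_type is a supported compression type for your data format. See Export formats and compression types.
delimiter is the character that indicates the boundary between columns in CSV exports. \t and tab are accepted names for tab.
boolean is true or false. When set to true, header rows are printed to the exported data if the data format supports headers. The default value is true.
project_id is your project ID.
dataset is the name of the source dataset.
table is the table you're exporting. If you use a partition decorator, you must surround the table path with single quotation marks or escape the $ character.
bucket is the name of the Cloud Storage bucket to which you're exporting the data. The BigQuery dataset and the Cloud Storage bucket must be in the same location.
filename.ext is the name and extension of the exported data file. You can export to multiple files by using a wildcard.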
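
When the command is assembled by a script, the quoting rule for partition decorators is easy to get wrong. The following sketch builds the command line as a string and lets Python's shlex module handle shell quoting. The helper name build_bq_extract_command is hypothetical; the flags are the ones listed above, and the sketch only builds the string without running bq.

```python
import shlex


def build_bq_extract_command(table, destination_uri, destination_format="CSV",
                             compression=None, field_delimiter=None,
                             print_header=True, location=None):
    """Return a shell-safe `bq extract` command line (illustrative helper)."""
    args = ["bq", "extract"]
    if location:
        args.append(f"--location={location}")
    args += ["--destination_format", destination_format]
    if compression:
        args += ["--compression", compression]
    if field_delimiter:
        args += ["--field_delimiter", field_delimiter]
    args.append(f"--print_header={'true' if print_header else 'false'}")
    args += [table, destination_uri]
    # shlex.join quotes any argument containing shell-special characters,
    # such as the "$" in a partition decorator or the "*" in a wildcard URI.
    return shlex.join(args)


command = build_bq_extract_command(
    "mydataset.my_partitioned_table$0",
    "gs://example-bucket/single_partition.csv",
)
print(command)
```

Passing the argument list directly to subprocess.run, without a shell, would avoid the quoting question entirely; the string form is shown here because it matches the commands in this section.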

Examples:

For example, the following command exports mydataset.mytable into a gzip-compressed file named myfile.csv. The file is stored in a Cloud Storage bucket named example-bucket.

bq extract \
--compression GZIP \
'mydataset.mytable' \
gs://example-bucket/myfile.csv

The default destination format is CSV. To export into JSON or Avro, use the destination_format flag and set it to either NEWLINE_DELIMITED_JSON or AVRO. For example:

bq extract \
--destination_format NEWLINE_DELIMITED_JSON \
'mydataset.mytable' \
gs://example-bucket/myfile.json

The following command exports mydataset.mytable into an Avro file that is compressed using Snappy. The file is named myfile.avro and is exported to a Cloud Storage bucket named example-bucket.

bq extract \
--destination_format AVRO \
--compression SNAPPY \
'mydataset.mytable' \
gs://example-bucket/myfile.avro

The following command exports a single partition of mydataset.my_partitioned_table into a CSV file in Cloud Storage:

bq extract \
--destination_format CSV \
'mydataset.my_partitioned_table$0' \
gs://example-bucket/single_partition.csv

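API

To export data, create an extract job and populate the job configuration.

(Optional) Specify your location in the location property in the jobReference section of the job resource.

Create an extract job that points to the BigQuery source data and the Cloud Storage destination.
Specify the source table by using the sourceTable configuration object that contains the project ID, dataset ID, and table ID.
The destination URI(s) property must be fully qualified, in the format gs://bucket/filename.ext. Each URI can contain one '*' wildcard character, and it must come after the bucket name.
Specify the data format by setting the configuration.extract.destinationFormat property. For example, to export a JSON file, set this property to the value NEWLINE_DELIMITED_JSON.
To check the job status, call jobs.get(job_id) with the ID of the job returned by the initial request.
- If status.state = DONE, the job has finished.
- If the status.errorResult property is present, the request failed, and that object includes information describing what went wrong.
- If status.errorResult is absent, the job finished successfully, although there might have been some nonfatal errors. Nonfatal errors are listed in the returned job object's status.errors property.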
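
The status rules above can be sketched as a small function that inspects a job resource that has already been fetched, for example the parsed JSON body of a jobs.get response. The function name summarize_job_status and its return strings are hypothetical; only the status.state, status.errorResult, and status.errors fields come from the text.

```python
def summarize_job_status(job):
    """Classify a job resource dict according to the rules above (illustrative)."""
    status = job.get("status", {})
    if status.get("state") != "DONE":
        return "running"  # The job is still pending or running.
    if "errorResult" in status:
        # The job finished but failed; errorResult describes what went wrong.
        return "failed: " + status["errorResult"].get("message", "unknown error")
    if status.get("errors"):
        # The job succeeded, but nonfatal errors were recorded.
        return "succeeded with %d non-fatal error(s)" % len(status["errors"])
    return "succeeded"


print(summarize_job_status({"status": {"state": "RUNNING"}}))
print(summarize_job_status(
    {"status": {"state": "DONE", "errorResult": {"message": "Not found"}}}))
```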

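API notes:

As a best practice, generate a unique ID and pass it as jobReference.jobId when calling jobs.insert to create a job. This approach is more robust to network failure because the client can poll or retry on the known job ID.
Calling jobs.insert on a given job ID is idempotent; in other words, you can retry as many times as you like on the same job ID, and at most one of those operations will succeed.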
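
As an illustration of these notes, the following sketch builds a jobs.insert request body with a client-generated job ID. The helper name build_extract_job_body and the "extract_" ID prefix are assumptions made for this example; the field names (jobReference, configuration.extract.sourceTable, destinationUris, destinationFormat) are the ones named in the API steps above. Sending the request is left out.

```python
import uuid


def build_extract_job_body(job_project, source_project, dataset_id, table_id,
                           destination_uris, destination_format="CSV",
                           location=None):
    """Build a jobs.insert body for an extract job (illustrative helper)."""
    job_reference = {
        "projectId": job_project,
        # A client-generated unique ID lets the caller poll or retry safely.
        "jobId": f"extract_{uuid.uuid4().hex}",
    }
    if location:
        job_reference["location"] = location
    return {
        "jobReference": job_reference,
        "configuration": {
            "extract": {
                "sourceTable": {
                    "projectId": source_project,
                    "datasetId": dataset_id,
                    "tableId": table_id,
                },
                "destinationUris": list(destination_uris),
                "destinationFormat": destination_format,
            }
        },
    }


body = build_extract_job_body(
    "my-project", "bigquery-public-data", "samples", "shakespeare",
    ["gs://my-bucket/shakespeare-*.csv"], location="US")
print(body["jobReference"]["jobId"])
```

Because the ID is fixed before the first request is sent, a request that times out can be repeated with the same body without risking a duplicate job.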

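C#

Before trying this sample, follow the C# setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery C# API reference documentation.

To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for client libraries.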


using Google.Cloud.BigQuery.V2;
using System;

public class BigQueryExtractTable
{
    public void ExtractTable(
        string projectId = "your-project-id",
        string bucketName = "your-bucket-name")
    {
        BigQueryClient client = BigQueryClient.Create(projectId);
        // Define a destination URI. Use a single wildcard URI if you think
        // your exported data will be larger than the 1 GB maximum value.
        string destinationUri = $"gs://{bucketName}/shakespeare-*.csv";
        BigQueryJob job = client.CreateExtractJob(
            projectId: "bigquery-public-data",
            datasetId: "samples",
            tableId: "shakespeare",
            destinationUri: destinationUri
        );
        job = job.PollUntilCompleted().ThrowOnAnyError();  // Waits for the job to complete.
        Console.Write($"Exported table to {destinationUri}.");
    }
}

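Go

Before trying this sample, follow the Go setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Go API reference documentation.

To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for client libraries.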

import (
	"context"
	"fmt"

	"cloud.google.com/go/bigquery"
)

// exportTableAsCSV demonstrates using an export job to
// write the contents of a table into Cloud Storage as CSV.
func exportTableAsCSV(projectID, gcsURI string) error {
	// projectID := "my-project-id"
	// gcsURI := "gs://mybucket/shakespeare.csv"
	ctx := context.Background()
	client, err := bigquery.NewClient(ctx, projectID)
	if err != nil {
		return fmt.Errorf("bigquery.NewClient: %v", err)
	}
	defer client.Close()

	srcProject := "bigquery-public-data"
	srcDataset := "samples"
	srcTable := "shakespeare"

	gcsRef := bigquery.NewGCSReference(gcsURI)
	gcsRef.FieldDelimiter = ","

	extractor := client.DatasetInProject(srcProject, srcDataset).Table(srcTable).ExtractorTo(gcsRef)
	extractor.DisableHeader = true
	// You can choose to run the job in a specific location for more complex data locality scenarios.
	// Ex: In this example, source dataset and GCS bucket are in the US.
	extractor.Location = "US"

	job, err := extractor.Run(ctx)
	if err != nil {
		return err
	}
	status, err := job.Wait(ctx)
	if err != nil {
		return err
	}
	if err := status.Err(); err != nil {
		return err
	}
	return nil
}

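Java

Before trying this sample, follow the Java setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Java API reference documentation.

To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for client libraries.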

import com.google.cloud.RetryOption;
import com.google.cloud.bigquery.BigQuery;
import com.google.cloud.bigquery.BigQueryException;
import com.google.cloud.bigquery.BigQueryOptions;
import com.google.cloud.bigquery.Job;
import com.google.cloud.bigquery.Table;
import com.google.cloud.bigquery.TableId;
import org.threeten.bp.Duration;

public class ExtractTableToCsv {

  public static void runExtractTableToCsv() {
    // TODO(developer): Replace these variables before running the sample.
    String projectId = "bigquery-public-data";
    String datasetName = "samples";
    String tableName = "shakespeare";
    String bucketName = "my-bucket";
    String destinationUri = "gs://" + bucketName + "/path/to/file";
    // For more information on export formats available see:
    // https://cloud.google.com/bigquery/docs/exporting-data#export_formats_and_compression_types
    // For more information on Job see:
    // https://googleapis.dev/java/google-cloud-clients/latest/index.html?com/google/cloud/bigquery/package-summary.html

    String dataFormat = "CSV";
    extractTableToCsv(projectId, datasetName, tableName, destinationUri, dataFormat);
  }

  // Exports datasetName:tableName to destinationUri as raw CSV
  public static void extractTableToCsv(
      String projectId,
      String datasetName,
      String tableName,
      String destinationUri,
      String dataFormat) {
    try {
      // Initialize client that will be used to send requests. This client only needs to be created
      // once, and can be reused for multiple requests.
      BigQuery bigquery = BigQueryOptions.getDefaultInstance().getService();

      TableId tableId = TableId.of(projectId, datasetName, tableName);
      Table table = bigquery.getTable(tableId);

      Job job = table.extract(dataFormat, destinationUri);

      // Blocks until this job completes its execution, either failing or succeeding.
      Job completedJob =
          job.waitFor(
              RetryOption.initialRetryDelay(Duration.ofSeconds(1)),
              RetryOption.totalTimeout(Duration.ofMinutes(3)));
      if (completedJob == null) {
        System.out.println("Job not executed since it no longer exists.");
        return;
      } else if (completedJob.getStatus().getError() != null) {
        System.out.println(
            "BigQuery was unable to extract due to an error: \n" + completedJob.getStatus().getError());
        return;
      }
      System.out.println(
          "Table export successful. Check in GCS bucket for the " + dataFormat + " file.");
    } catch (BigQueryException | InterruptedException e) {
      System.out.println("Table extraction job was interrupted. \n" + e.toString());
    }
  }
}

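Node.js

Before trying this sample, follow the Node.js setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Node.js API reference documentation.

To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for client libraries.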

// Import the Google Cloud client libraries
const {BigQuery} = require('@google-cloud/bigquery');
const {Storage} = require('@google-cloud/storage');

const bigquery = new BigQuery();
const storage = new Storage();

async function extractTableToGCS() {
  // Exports my_dataset:my_table to gcs://my-bucket/my-file as raw CSV.

  /**
   * TODO(developer): Uncomment the following lines before running the sample.
   */
  // const datasetId = "my_dataset";
  // const tableId = "my_table";
  // const bucketName = "my-bucket";
  // const filename = "file.csv";

  // Location must match that of the source table.
  const options = {
    location: 'US',
  };

  // Export data from the table into a Google Cloud Storage file
  const [job] = await bigquery
    .dataset(datasetId)
    .table(tableId)
    .extract(storage.bucket(bucketName).file(filename), options);

  console.log(`Job ${job.id} created.`);

  // Check the job's status for errors
  const errors = job.status.errors;
  if (errors && errors.length > 0) {
    throw errors;
  }
}

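PHP

Before trying this sample, follow the PHP setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery PHP API reference documentation.

To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for client libraries.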

use Google\Cloud\BigQuery\BigQueryClient;

/**
 * Extracts the given table as json to given GCS bucket.
 *
 * @param string $projectId The project Id of your Google Cloud Project.
 * @param string $datasetId The BigQuery dataset ID.
 * @param string $tableId The BigQuery table ID.
 * @param string $bucketName Bucket name in Google Cloud Storage
 */
function extract_table(
    string $projectId,
    string $datasetId,
    string $tableId,
    string $bucketName
): void {
    $bigQuery = new BigQueryClient([
      'projectId' => $projectId,
    ]);
    $dataset = $bigQuery->dataset($datasetId);
    $table = $dataset->table($tableId);
    $destinationUri = "gs://{$bucketName}/{$tableId}.json";
    // Define the format to use. If the format is not specified, 'CSV' will be used.
    $format = 'NEWLINE_DELIMITED_JSON';
    // Create the extract job
    $extractConfig = $table->extract($destinationUri)->destinationFormat($format);
    // Run the job
    $job = $table->runJob($extractConfig);  // Waits for the job to complete
    printf('Exported %s to %s' . PHP_EOL, $table->id(), $destinationUri);
}

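Python

Before trying this sample, follow the Python setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Python API reference documentation.

To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for client libraries.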

from google.cloud import bigquery

client = bigquery.Client()
bucket_name = "my-bucket"  # TODO(developer): Replace with your bucket name.
project = "bigquery-public-data"
dataset_id = "samples"
table_id = "shakespeare"

destination_uri = "gs://{}/{}".format(bucket_name, "shakespeare.csv")
dataset_ref = bigquery.DatasetReference(project, dataset_id)
table_ref = dataset_ref.table(table_id)

extract_job = client.extract_table(
    table_ref,
    destination_uri,
    # Location must match that of the source table.
    location="US",
)  # API request
extract_job.result()  # Waits for job to complete.

print(
    "Exported {}:{}.{} to {}".format(project, dataset_id, table_id, destination_uri)
)

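Ruby

Before trying this sample, follow the Ruby setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Ruby API reference documentation.

To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for client libraries.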

require "google/cloud/bigquery"

def extract_table bucket_name = "my-bucket",
                  dataset_id  = "my_dataset_id",
                  table_id    = "my_table_id"
  bigquery = Google::Cloud::Bigquery.new
  dataset  = bigquery.dataset dataset_id
  table    = dataset.table    table_id

  # Define a destination URI. Use a single wildcard URI if you think
  # your exported data will be larger than the 1 GB maximum value.
  destination_uri = "gs://#{bucket_name}/output-*.csv"

  extract_job = table.extract_job destination_uri do |config|
    # Location must match that of the source table.
    config.location = "US"
  end
  extract_job.wait_until_done! # Waits for the job to complete

  puts "Exported #{table.id} to #{destination_uri}"
end

Export table metadata

To export table metadata from Iceberg tables, use the following SQL statement:

EXPORT TABLE METADATA FROM `[[PROJECT_NAME.]DATASET_NAME.]TABLE_NAME`;

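Replace the following:

PROJECT_NAME: the name of the project for the table. The value defaults to the project that runs this query.
DATASET_NAME: the name of the dataset for the table.
TABLE_NAME: the name of the table.

The exported metadata is located in the STORAGE_URI/metadata folder, where STORAGE_URI is the table's storage location set in the options.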

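Avro export details

BigQuery expresses Avro formatted data in the following ways:

The resulting export files are Avro container files.
Each BigQuery row is represented as an Avro record. Nested data is represented by nested record objects.
REQUIRED fields are represented as the corresponding Avro types. For example, a BigQuery INTEGER type maps to an Avro LONG type.
NULLABLE fields are represented as an Avro union of the corresponding type and "null".
REPEATED fields are represented as Avro arrays.
TIMESTAMP data types are represented as the timestamp-micros logical type (which annotates an Avro LONG type) by default in both extract jobs and Export Data SQL. (Caution: you can add use_avro_logical_types=False to Export Data Options to disable the logical type so that timestamp columns use the string type instead, but extract jobs always use the Avro logical type.)
DATE data types are represented as the date logical type (which annotates an Avro INT type) by default in Export Data SQL, but as the string type by default in extract jobs. (Note: you can add use_avro_logical_types=False to Export Data Options to disable the logical type, or use the flag --use_avro_logical_types=True to enable the logical type in extract jobs.)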
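TIME data types are represented as the time-micros logical type (which annotates an Avro LONG type) by default in Export Data SQL, but as the string type by default in extract jobs. (Note: you can add use_avro_logical_types=False to Export Data Options to disable the logical type, or use the flag --use_avro_logical_types=True to enable the logical type in extract jobs.)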
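DATETIME data types are represented as Avro STRING types (a string type with the custom named logical type datetime) by default in Export Data SQL, but as the string type by default in extract jobs. (Note: you can add use_avro_logical_types=False to Export Data Options to disable the logical type, or use the flag --use_avro_logical_types=True to enable the logical type in extract jobs.)
The RANGE type isn't supported in Avro exports.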
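
The REQUIRED, NULLABLE, and REPEATED rules above can be illustrated with a short sketch that builds Avro field definitions as plain dictionaries. The helper avro_field is hypothetical, and the type table is deliberately small: only the INTEGER to long mapping is stated in this document, while the STRING and BOOLEAN entries are assumptions added for illustration.

```python
import json

# BigQuery type -> Avro primitive type. Only INTEGER -> long comes from the
# text above; the other two entries are illustrative assumptions.
AVRO_TYPES = {"INTEGER": "long", "STRING": "string", "BOOLEAN": "boolean"}


def avro_field(name, bq_type, mode="NULLABLE"):
    """Return an Avro field definition for one BigQuery column (illustrative)."""
    avro_type = AVRO_TYPES[bq_type]
    if mode == "REQUIRED":
        field_type = avro_type  # The corresponding Avro type, used directly.
    elif mode == "NULLABLE":
        field_type = ["null", avro_type]  # Union of "null" and the type.
    elif mode == "REPEATED":
        field_type = {"type": "array", "items": avro_type}  # Avro array.
    else:
        raise ValueError(f"Unknown mode: {mode}")
    return {"name": name, "type": field_type}


schema = {
    "type": "record",
    "name": "Root",  # Hypothetical record name.
    "fields": [
        avro_field("word_count", "INTEGER", "REQUIRED"),
        avro_field("corpus", "STRING", "NULLABLE"),
        avro_field("tags", "STRING", "REPEATED"),
    ],
}
print(json.dumps(schema, indent=2))
```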

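Parameterized NUMERIC(P[, S]) and BIGNUMERIC(P[, S]) data types transfer their precision and scale type parameters to the Avro decimal logical type.

The Avro format can't be used in combination with GZIP compression. To compress Avro data, use the bq command-line tool or the API and specify one of the supported compression types for Avro data: DEFLATE or SNAPPY.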

Parquet export details

BigQuery converts GoogleSQL data types to the following Parquet data types:

BigQuery data type	Parquet primitive type	Parquet logical type
Integer	`INT64`	`NONE`
Numeric	`FIXED_LEN_BYTE_ARRAY`	`DECIMAL (precision = 38, scale = 9)`
Numeric(P[, S])	`FIXED_LEN_BYTE_ARRAY`	`DECIMAL (precision = P, scale = S)`
BigNumeric	`FIXED_LEN_BYTE_ARRAY`	`DECIMAL (precision = 76, scale = 38)`
BigNumeric(P[, S])	`FIXED_LEN_BYTE_ARRAY`	`DECIMAL (precision = P, scale = S)`
Floating point	`FLOAT`	`NONE`
Boolean	`BOOLEAN`	`NONE`
String	`BYTE_ARRAY`	`STRING (UTF8)`
Bytes	`BYTE_ARRAY`	`NONE`
Date	`INT32`	`DATE`
Datetime	`INT64`	`TIMESTAMP (isAdjustedToUTC = false, unit = MICROS)`
Time	`INT64`	`TIME (isAdjustedToUTC = true, unit = MICROS)`
Timestamp	`INT64`	`TIMESTAMP (isAdjustedToUTC = false, unit = MICROS)`
Geography	`BYTE_ARRAY`	`GEOGRAPHY (edges = spherical)`

The Parquet schema represents nested data as a group and repeated records as repeated groups. For more information about using nested and repeated data in BigQuery, see Specifying nested and repeated columns.

You can use the following workarounds for DATETIME types:

Load the file into a staging table. Then use a SQL query to cast the field to DATETIME and save the result to a new table. For more information, see Changing a column's data type.
Provide a schema for the table by using the --schema flag in the load job. Define the datetime column as col:DATETIME.

The GEOGRAPHY logical type is represented with GeoParquet metadata added to the exported files.

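Exporting data into one or more files

The destinationUris property indicates the one or more locations and filenames where BigQuery should export your files.

BigQuery supports a single wildcard operator (*) in each URI. The wildcard can appear anywhere in the filename component. Using the wildcard operator instructs BigQuery to create multiple sharded files based on the supplied pattern. The wildcard operator is replaced with a number (starting at 0), left-padded to 12 digits. For example, a URI with a wildcard at the end of the filename creates files with 000000000000 appended to the first file, 000000000001 appended to the second file, and so on.

The following describes several possible options for the destinationUris property: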

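destinationUris options

Single URI

Use a single URI if you are exporting table data that is 1 GB or less. This option is the most common use case, as exported data is generally less than the 1 GB maximum value. This option is not supported for the EXPORT DATA statement; you must use a single wildcard URI.

Property definition:

['gs://my-bucket/file-name.json']

Creates:

gs://my-bucket/file-name.json

Single wildcard URI

A single wildcard can be used only in the filename component of the URI.

Use a single wildcard URI if you think your exported data will be larger than the 1 GB maximum value. BigQuery shards your data into multiple files based on the provided pattern. The size of the exported files varies.

Property definition:

['gs://my-bucket/file-name-*.json']

Creates:

gs://my-bucket/file-name-000000000000.json
gs://my-bucket/file-name-000000000001.json
gs://my-bucket/file-name-000000000002.json
...

['gs://my-bucket/*']

Creates: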

gs://my-bucket/000000000000
gs://my-bucket/000000000001
gs://my-bucket/000000000002
...
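
The naming rule for sharded files can be reproduced locally, for example to predict which object names a downstream job should look for. The following is a minimal sketch; the function name expand_wildcard_uri is hypothetical, and the number of files is supplied by the caller because BigQuery decides the actual count.

```python
def expand_wildcard_uri(uri_pattern, file_count):
    """List the object names a single-wildcard URI would produce (illustrative)."""
    if uri_pattern.count("*") != 1:
        raise ValueError("The URI must contain exactly one '*' wildcard.")
    # The wildcard is replaced with a number starting at 0,
    # left-padded with zeros to 12 digits.
    return [uri_pattern.replace("*", f"{i:012d}") for i in range(file_count)]


for uri in expand_wildcard_uri("gs://my-bucket/file-name-*.json", 3):
    print(uri)
```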

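Limit the exported file size

When you export more than 1 GB of data in a single export, you must use a wildcard to export the data into multiple files, and the size of the files varies. If you need to limit the maximum size of each exported file, one option is to randomly partition your data and then export each partition to a file:

Determine the number of partitions you need, which is equal to the total size of your data divided by the chosen exported file size. For example, if you have 8,000 MB of data and you want each exported file to be approximately 20 MB, then you need 400 partitions.

Create a new table that is partitioned and clustered by a new randomly generated column called export_id. The following example shows how to create a new processed_table from an existing table called source_table, which requires n partitions to achieve the chosen file size: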

CREATE TABLE my_dataset.processed_table
PARTITION BY RANGE_BUCKET(export_id, GENERATE_ARRAY(0, n, 1))
CLUSTER BY export_id
AS (
  SELECT *, CAST(FLOOR(n*RAND()) AS INT64) AS export_id
  FROM my_dataset.source_table
);

For each integer i between 0 and n-1, run an EXPORT DATA statement on the following query:
```
SELECT * EXCEPT(export_id)
FROM my_dataset.processed_table
WHERE export_id = i;
```
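
The partition arithmetic and the per-partition statements can be generated with a short script. The following is a sketch only: the function names, the URI layout (part-i-*.csv under a caller-supplied prefix), and the CSV options are assumptions made for this example, while the table and column names are the ones used in the steps above. The script prints SQL text and does not run it.

```python
import math


def partition_count(total_mb, target_file_mb):
    """Number of partitions: total data size divided by the target file size."""
    return math.ceil(total_mb / target_file_mb)


def export_statements(n, uri_prefix):
    """Build one EXPORT DATA statement per export_id value from 0 to n-1."""
    template = (
        "EXPORT DATA OPTIONS (\n"
        "  uri = '{uri_prefix}/part-{i}-*.csv',\n"
        "  format = 'CSV',\n"
        "  overwrite = true)\n"
        "AS (\n"
        "  SELECT * EXCEPT(export_id)\n"
        "  FROM my_dataset.processed_table\n"
        "  WHERE export_id = {i}\n"
        ");"
    )
    return [template.format(uri_prefix=uri_prefix, i=i) for i in range(n)]


n = partition_count(8000, 20)  # 8,000 MB of data in files of about 20 MB.
statements = export_statements(n, "gs://my-bucket/export")
print(n)
print(statements[0])
```

Each URI still contains a wildcard because the EXPORT DATA statement requires one; the random partitioning only keeps each partition close to the target size, it does not guarantee it.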

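Extract compressed table

Go

Before trying this sample, follow the Go setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Go API reference documentation.

To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for client libraries.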

import (
	"context"
	"fmt"

	"cloud.google.com/go/bigquery"
)

// exportTableAsCompressedCSV demonstrates using an export job to
// write the contents of a table into Cloud Storage as compressed CSV.
func exportTableAsCompressedCSV(projectID, gcsURI string) error {
	// projectID := "my-project-id"
	// gcsURI := "gs://mybucket/shakespeare.csv"
	ctx := context.Background()
	client, err := bigquery.NewClient(ctx, projectID)
	if err != nil {
		return fmt.Errorf("bigquery.NewClient: %w", err)
	}
	defer client.Close()

	srcProject := "bigquery-public-data"
	srcDataset := "samples"
	srcTable := "shakespeare"

	gcsRef := bigquery.NewGCSReference(gcsURI)
	gcsRef.Compression = bigquery.Gzip

	extractor := client.DatasetInProject(srcProject, srcDataset).Table(srcTable).ExtractorTo(gcsRef)
	extractor.DisableHeader = true
	// You can choose to run the job in a specific location for more complex data locality scenarios.
	// Ex: In this example, source dataset and GCS bucket are in the US.
	extractor.Location = "US"

	job, err := extractor.Run(ctx)
	if err != nil {
		return err
	}
	status, err := job.Wait(ctx)
	if err != nil {
		return err
	}
	if err := status.Err(); err != nil {
		return err
	}
	return nil
}

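Java

Before trying this sample, follow the Java setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Java API reference documentation.

To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for client libraries.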

import com.google.cloud.bigquery.BigQuery;
import com.google.cloud.bigquery.BigQueryException;
import com.google.cloud.bigquery.BigQueryOptions;
import com.google.cloud.bigquery.ExtractJobConfiguration;
import com.google.cloud.bigquery.Job;
import com.google.cloud.bigquery.JobInfo;
import com.google.cloud.bigquery.TableId;

// Sample to extract a compressed table
public class ExtractTableCompressed {

  public static void main(String[] args) {
    // TODO(developer): Replace these variables before running the sample.
    String projectName = "MY_PROJECT_NAME";
    String datasetName = "MY_DATASET_NAME";
    String tableName = "MY_TABLE_NAME";
    String bucketName = "MY-BUCKET-NAME";
    String destinationUri = "gs://" + bucketName + "/path/to/file";
    // For more information on export formats available see:
    // https://cloud.google.com/bigquery/docs/exporting-data#export_formats_and_compression_types
    String compressed = "gzip";
    // For more information on Job see:
    // https://googleapis.dev/java/google-cloud-clients/latest/index.html?com/google/cloud/bigquery/package-summary.html
    String dataFormat = "CSV";

    extractTableCompressed(
        projectName, datasetName, tableName, destinationUri, dataFormat, compressed);
  }

  public static void extractTableCompressed(
      String projectName,
      String datasetName,
      String tableName,
      String destinationUri,
      String dataFormat,
      String compressed) {
    try {
      // Initialize client that will be used to send requests. This client only needs to be created
      // once, and can be reused for multiple requests.
      BigQuery bigquery = BigQueryOptions.getDefaultInstance().getService();

      TableId tableId = TableId.of(projectName, datasetName, tableName);

      ExtractJobConfiguration extractConfig =
          ExtractJobConfiguration.newBuilder(tableId, destinationUri)
              .setCompression(compressed)
              .setFormat(dataFormat)
              .build();

      Job job = bigquery.create(JobInfo.of(extractConfig));

      // Blocks until this job completes its execution, either failing or succeeding.
      Job completedJob = job.waitFor();
      if (completedJob == null) {
        System.out.println("Job not executed since it no longer exists.");
        return;
      } else if (completedJob.getStatus().getError() != null) {
        System.out.println(
            "BigQuery was unable to extract due to an error: \n" + completedJob.getStatus().getError());
        return;
      }
      System.out.println("Table extract compressed successful");
    } catch (BigQueryException | InterruptedException e) {
      System.out.println("Table extraction job was interrupted. \n" + e.toString());
    }
  }
}

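Node.js

Before trying this sample, follow the Node.js setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Node.js API reference documentation.

To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for client libraries.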

// Import the Google Cloud client libraries
const {BigQuery} = require('@google-cloud/bigquery');
const {Storage} = require('@google-cloud/storage');

const bigquery = new BigQuery();
const storage = new Storage();

async function extractTableCompressed() {
  // Exports my_dataset:my_table to gcs://my-bucket/my-file as a compressed file.

  /**
   * TODO(developer): Uncomment the following lines before running the sample.
   */
  // const datasetId = "my_dataset";
  // const tableId = "my_table";
  // const bucketName = "my-bucket";
  // const filename = "file.csv";

  // Location must match that of the source table.
  const options = {
    location: 'US',
    gzip: true,
  };

  // Export data from the table into a Google Cloud Storage file
  const [job] = await bigquery
    .dataset(datasetId)
    .table(tableId)
    .extract(storage.bucket(bucketName).file(filename), options);

  console.log(`Job ${job.id} created.`);

  // Check the job's status for errors
  const errors = job.status.errors;
  if (errors && errors.length > 0) {
    throw errors;
  }
}

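Python

Before trying this sample, follow the Python setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Python API reference documentation.

To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for client libraries.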

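from google.cloud import bigquery

client = bigquery.Client()
bucket_name = "my-bucket"  # TODO(developer): Replace with your bucket name.
project = "bigquery-public-data"
dataset_id = "samples"
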
destination_uri = "gs://{}/{}".format(bucket_name, "shakespeare.csv.gz")
dataset_ref = bigquery.DatasetReference(project, dataset_id)
table_ref = dataset_ref.table("shakespeare")
job_config = bigquery.job.ExtractJobConfig()
job_config.compression = bigquery.Compression.GZIP

extract_job = client.extract_table(
    table_ref,
    destination_uri,
    # Location must match that of the source table.
    location="US",
    job_config=job_config,
)  # API request
extract_job.result()  # Waits for job to complete.

Example use case

This example shows how you can export data to Cloud Storage.

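Suppose you are continuously streaming data from endpoint logs into BigQuery. A daily snapshot is to be exported to Cloud Storage for backup and archival purposes. The best choice is an extract job, which is subject to certain quotas and limitations.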

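Submit an extract job with the API or client libraries, passing in a unique ID as jobReference.jobId. Extract jobs are asynchronous. Check the job status by using the unique job ID that was used to create the job. The job has finished when status.state is DONE. If status.errorResult is present, the job failed and needs to be retried.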
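
The submit, poll, and retry cycle described above might be sketched as follows. To keep the example independent of any client library, the two API calls are passed in as functions: submit_job and get_job are hypothetical callables that the caller implements on top of jobs.insert and jobs.get. The job ID prefix and the retry limit are also assumptions.

```python
import time
import uuid


def run_extract_with_retry(submit_job, get_job, max_attempts=3, poll_seconds=1.0):
    """Submit an extract job, poll until DONE, and retry on failure (illustrative)."""
    for _ in range(max_attempts):
        # Use a fresh unique ID for every attempt so that a failed job's ID
        # is never reused.
        job_id = f"daily_snapshot_{uuid.uuid4().hex}"
        submit_job(job_id)
        while True:
            status = get_job(job_id)["status"]
            if status["state"] == "DONE":
                break
            time.sleep(poll_seconds)  # Extract jobs are asynchronous.
        if "errorResult" not in status:
            return job_id  # Finished without a fatal error.
    raise RuntimeError(f"Extract job failed after {max_attempts} attempts")


# Usage with stand-in functions: the first job fails, the second succeeds.
submitted = []


def fake_submit(job_id):
    submitted.append(job_id)


def fake_get(job_id):
    if len(submitted) == 1:
        return {"status": {"state": "DONE", "errorResult": {"message": "backend error"}}}
    return {"status": {"state": "DONE"}}


successful_id = run_extract_with_retry(fake_submit, fake_get, poll_seconds=0)
print(len(submitted), successful_id == submitted[-1])
```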

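Batch data processing

Suppose a nightly batch job is used to load data by a fixed deadline. After this load job completes, a table with statistics is materialized from a query. Data from this table is retrieved, compiled into a PDF report, and sent to a regulator.

Because the amount of data that needs to be read is small, use the tabledata.list API to retrieve all rows of the table in JSON dictionary format. If there is more than one page of data, the results have the pageToken property set. To retrieve the next page of results, make another tabledata.list call and include the token value as the pageToken parameter. If the API call fails with a 5xx error, retry with exponential backoff. Most 4xx errors cannot be retried. To better decouple the BigQuery export from report generation, persist the results to disk.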
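
The paging and retry behavior described above can be sketched without committing to a particular HTTP client. In the following example, fetch_page is a hypothetical callable that performs one tabledata.list call for a given page token and returns the parsed response, and ServerError is a hypothetical exception that the callable raises for 5xx responses. The retry count and delays are assumptions.

```python
import time


class ServerError(Exception):
    """Hypothetical stand-in for an HTTP 5xx response from the API."""


def list_all_rows(fetch_page, max_retries=5, base_delay=1.0):
    """Collect every row by following pageToken, retrying 5xx errors (illustrative)."""
    rows = []
    page_token = None
    while True:
        for attempt in range(max_retries + 1):
            try:
                page = fetch_page(page_token)
                break
            except ServerError:
                if attempt == max_retries:
                    raise  # Give up after the last retry.
                # Exponential backoff: base_delay, 2x, 4x, and so on.
                time.sleep(base_delay * 2 ** attempt)
        rows.extend(page.get("rows", []))
        page_token = page.get("pageToken")
        if not page_token:
            return rows  # No token means this was the last page.


# Usage with a stand-in fetcher: one transient failure, then two pages.
calls = {"n": 0}


def fake_fetch(token):
    calls["n"] += 1
    if calls["n"] == 1:
        raise ServerError("503")
    if token is None:
        return {"rows": [1, 2], "pageToken": "t1"}
    return {"rows": [3]}


all_rows = list_all_rows(fake_fetch, base_delay=0)
print(all_rows)
```

Errors other than ServerError, such as 4xx responses, propagate immediately, which matches the guidance that most of them cannot be retried.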

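Quota policy

For information on export job quotas, see Export jobs on the Quotas and limits page.

Usage for export jobs is available in INFORMATION_SCHEMA. The job entry in the JOBS_BY_* system tables for an export job contains a total_bytes_processed value that can be used to monitor the aggregate usage and ensure that it stays under 50 TiB per day. To learn how to query the INFORMATION_SCHEMA.JOBS view to get the total_bytes_processed value, see the INFORMATION_SCHEMA.JOBS schema.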
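
Once the total_bytes_processed values for the day's export jobs have been retrieved, comparing them with the limit is simple arithmetic. The following sketch assumes the 50 TiB per day figure stated above; the function name and the sample values are hypothetical.

```python
TIB = 1024 ** 4  # Bytes in one tebibyte.
DAILY_EXPORT_LIMIT_BYTES = 50 * TIB  # The 50 TiB per day figure from the text.


def export_quota_usage(total_bytes_processed_values):
    """Return the fraction of the daily export limit consumed (illustrative)."""
    used = sum(total_bytes_processed_values)
    return used / DAILY_EXPORT_LIMIT_BYTES


# Hypothetical values from two export jobs that ran today.
usage = export_quota_usage([10 * TIB, 15 * TIB])
print(f"{usage:.0%} of the daily export limit used")
```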

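View current quota usage

You can view your current usage of query, load, extract, or copy jobs by running an INFORMATION_SCHEMA query to view metadata about the jobs run over a specified time period. You can compare your current usage against the quota limit to determine your quota usage for a particular type of job. The following example query uses the INFORMATION_SCHEMA.JOBS view to list the number of query, load, extract, and copy jobs by project: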

SELECT
  sum(case  when job_type="QUERY" then 1 else 0 end) as QRY_CNT,
  sum(case  when job_type="LOAD" then 1 else 0 end) as LOAD_CNT,
  sum(case  when job_type="EXTRACT" then 1 else 0 end) as EXT_CNT,
  sum(case  when job_type="COPY" then 1 else 0 end) as CPY_CNT
FROM `region-REGION_NAME`.INFORMATION_SCHEMA.JOBS_BY_PROJECT
WHERE date(creation_time)= CURRENT_DATE()

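You can set up a Cloud Monitoring alerting policy that notifies you of the number of bytes exported.

In the Google Cloud console, go to the "Monitoring" page.
In the navigation pane, select "Metrics Explorer".

In the MQL query editor, set up an alert to monitor the bytes exported per day, as shown in the following example: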

fetch consumer_quota
  | filter resource.service == 'bigquery.googleapis.com'
  | { metric serviceruntime.googleapis.com/quota/rate/net_usage
      | align delta_gauge(1m)
      | group_by [resource.project_id, metric.quota_metric, resource.location],
          sum(value.net_usage)
    ; metric serviceruntime.googleapis.com/quota/limit
      | filter metric.limit_name == 'ExtractBytesPerDay'
      | group_by [resource.project_id, metric.quota_metric, resource.location],
          sliding(1m), max(val()) }
  | ratio
  | every 1m
  | condition gt(val(), 0.01 '1')

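To set up your alert, click "Run query".

For more information, see Alerting policies with MQL.

Troubleshooting

To diagnose issues with extract jobs, you can use the Logs Explorer to review the logs for a specific extract job and identify possible errors. The following Logs Explorer filter returns information about your extract jobs: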

resource.type="bigquery_resource"
protoPayload.methodName="jobservice.insert"
(protoPayload.serviceData.jobInsertRequest.resource.jobConfiguration.query.query=~"EXPORT" OR
protoPayload.serviceData.jobCompletedEvent.eventName="extract_job_completed" OR
protoPayload.serviceData.jobCompletedEvent.job.jobConfiguration.query.query=~"EXPORT")

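Pricing

For information on data export pricing, see the BigQuery pricing page.

After the data is exported, you are charged for storing the data in Cloud Storage. For more information, see Cloud Storage pricing.

Table security

To control access to tables in BigQuery, see Control access to resources with IAM.

What's next

To learn more about the Google Cloud console, see Using the Google Cloud console.
To learn more about the bq command-line tool, see Using the bq command-line tool.
To learn how to create an application using the BigQuery API client libraries, see the client library quickstart.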