Use Datastream to ingest data into partitioned tables in BigQuery
Stay organized with collections
Save and categorize content based on your preferences.
You might need to partition your BigQuery tables into smaller segments to
improve query performance and control costs. Because Datastream doesn't
support partitioning tables in BigQuery, you need to manually add the
partitions before starting your stream. For general information about partitioning
in BigQuery, see Introduction to partitioned tables.
Partition tables in BigQuery
To partition your tables in BigQuery, use one of the options described
in the sections that follow, depending on your use case.
Option 1: The table already exists in BigQuery and is included in a stream
Exclude the table from the source configuration of your stream. For more
information about including and excluding objects from your source configuration,
see Configure source databases.
Wait a few minutes to ensure that Datastream has completed processing
all events for the table.
Create your partitioned table in BigQuery.
If you want to keep the data that was already in the original BigQuery
table, give the table a different, temporary name.
Copy the data from the original table to the new partitioned table.
Drop or rename the original table.
Change the temporary name of the new table to the name of the original table.
Add the source table to the configuration of your stream.
Option 2: The table doesn't exist in BigQuery
Create the table in BigQuery using one of the following approaches:
Manually create a Datastream-compatible BigQuery table. For
example, if you want to create a table and partition the data based on the
TIMESTAMP column, you can use a query similar to the following:
After you create the partitioned table, make sure that its max_staleness value
is set according to your requirements. If you don't set the value, the default
value of 0 is set. Leaving this value at 0 ensures the freshest data, but
incurs a significant cost. For information about how to find the optimal value
for your table, see Use BigQuery tables with the max_staleness option.
[[["Easy to understand","easyToUnderstand","thumb-up"],["Solved my problem","solvedMyProblem","thumb-up"],["Other","otherUp","thumb-up"]],[["Hard to understand","hardToUnderstand","thumb-down"],["Incorrect information or sample code","incorrectInformationOrSampleCode","thumb-down"],["Missing the information/samples I need","missingTheInformationSamplesINeed","thumb-down"],["Other","otherDown","thumb-down"]],["Last updated 2025-08-25 UTC."],[[["\u003cp\u003eBigQuery table partitioning is necessary for improved query performance and cost control, but Datastream does not automatically support it.\u003c/p\u003e\n"],["\u003cp\u003eFor tables already in BigQuery and included in a stream, you must exclude the table from the stream, create a partitioned table, copy data, and then re-add the table to the stream's configuration.\u003c/p\u003e\n"],["\u003cp\u003eIf the table is not yet in BigQuery, you can create a partitioned table manually or using the BigQuery Migration Toolkit, ensuring to set the \u003ccode\u003emax_staleness\u003c/code\u003e value, and then add it to your stream configuration.\u003c/p\u003e\n"],["\u003cp\u003eWhen creating partitioned tables manually, ensure that you define the correct data types and specify the \u003ccode\u003ePARTITION BY\u003c/code\u003e clause in your query, and create your primary key.\u003c/p\u003e\n"]]],[],null,["# Use Datastream to ingest data into partitioned tables in BigQuery\n\nYou might need to partition your BigQuery tables into smaller segments to\nimprove query performance and control costs. Because Datastream doesn't\nsupport partitioning tables in BigQuery, you need to manually add the\npartitions before starting your stream. For general information about partitioning\nin BigQuery, see [Introduction to partitioned tables](/bigquery/docs/partitioned-tables).\n\nPartition tables in BigQuery\n----------------------------\n\nTo partition your tables in BigQuery, use one of the options described\nin the sections that follow, depending on your use case.\n\n### Option 1: The table already exists in BigQuery and is included in a stream\n\n1. Exclude the table from the source configuration of your stream. For more information about including and excluding objects from your source configuration, see [Configure source databases](/datastream/docs/create-a-stream#configuresourcedb).\n2. Wait a few minutes to ensure that Datastream has completed processing all events for the table.\n3. [Create your partitioned table in BigQuery](/bigquery/docs/creating-partitioned-tables). If you want to keep the data that was already in the original BigQuery table, give the table a different, temporary name.\n4. Copy the data from the original table to the new partitioned table.\n5. Drop or rename the original table.\n6. Change the temporary name of the new table to the name of the original table.\n7. Add the source table to the configuration of your stream.\n\n### Option 2: The table doesn't exist in BigQuery\n\n1. Create the table in BigQuery using one of the following approaches:\n\n - Use the [BigQuery Migration Toolkit](/datastream/docs/best-practices-migration-toolkit).\n - Manually create a Datastream-compatible BigQuery table. For\n example, if you want to create a table and partition the data based on the\n `TIMESTAMP` column, you can use a query similar to the following:\n\n CREATE TABLE dataset.partitioned_table (\n 'id' INT64,\n 'name' STRING\n 'update_date' DATETIME,\n 'datastream_metadata' STRUCT\u003c'uuid' STRING, 'source_timestamp' INT64\u003e,\n PRIMARY KEY ('id') NOT ENFORCED\n )\n PARTITION BY TIMESTAMP(update_date)\n\n2. After you create the partitioned table, make sure that its `max_staleness` value\n is set according to your requirements. If you don't set the value, the default\n value of `0` is set. Leaving this value at `0` ensures the freshest data, but\n incurs a significant cost. For information about how to find the optimal value\n for your table, see [Use BigQuery tables with the `max_staleness` option](/datastream/docs/destination-bigquery#use-max-staleness).\n\n3. [Add the source table to the configuration of your stream](/datastream/docs/create-a-stream#configuresourcedb).\n\n4. Optionally, if you've set manual backfill for the stream, initiate backfill\n for the table."]]