Last updated (UTC): 2025-08-18.

# Dataflow regions

The Dataflow region stores and handles metadata about your
Dataflow job and deploys and controls your Dataflow
workers.

Region names follow a standard convention based on
[Compute Engine region names](/compute/docs/regions-zones/regions-zones#available).
For example, the name for the Central US region
is `us-central1`.

This feature is available in all regions where Dataflow is supported. To see the available locations, read [Dataflow locations](/dataflow/docs/resources/locations).

Guidelines for choosing a region
--------------------------------

Use the following guidelines to choose an appropriate region for your job.

### Security and compliance

You might need to constrain Dataflow job processing to a specific geographic region to support the security and compliance needs of your project.

### Data locality

You can minimize network latency and network transport costs by running a Dataflow job in the same region as its sources, sinks, staging file locations, and temporary file locations. If you use sources, sinks, staging file locations, or temporary file locations that are located *outside* of your job's region, your data might be sent across regions.

| **Note:** Starting with Beam SDK version 2.44.0, Dataflow does not support running jobs with workers in a region that is different from the job region.

When a pipeline runs, user data is handled only by the Dataflow worker pool, and data movement is restricted to the network paths that connect the workers in that pool.

Although user data is strictly handled by Dataflow workers in their assigned geographic region, pipeline log messages are stored in [Cloud Logging](/logging/docs), which has a single global presence in Google Cloud.

If you need more control over the location of pipeline log messages, you can do the following:

1. [Create an exclusion filter](/logging/docs/exclusions#create-filter-existing) for the `_Default` log router sink to prevent Dataflow logs from being exported to the `_Default` log bucket.
2. [Create a log bucket](/logging/docs/buckets#create_bucket) in the region of your choice.
3.
Configure a new log router sink that exports your Dataflow logs to your new log bucket.

To learn more about configuring logging, see [Routing and storage overview](/logging/docs/routing/overview) and [Log routing overview](/logging/docs/export).

Notes about common Dataflow job sources:

- When using a Cloud Storage bucket as a source, we recommend that you perform **read** operations in the same [region as the bucket](/storage/docs/bucket-locations).
- [Pub/Sub](/pubsub/architecture) topics, when published to the global Pub/Sub endpoint, are stored in the nearest Google Cloud region. However, you can modify the topic storage policy to a specific [region or a set of regions](/pubsub/docs/resource-location-restriction). [Pub/Sub Lite](/pubsub/docs/choosing-pubsub-or-lite#comparison_table) topics, in contrast, support only zonal storage.

### Resilience and geographic separation

You might want to isolate your normal Dataflow operations from outages that could occur in other [geographic regions](/docs/geography-and-regions). Or, you might need to plan alternate sites for business continuity in the event of a region-wide disaster.

In your [disaster recovery and business continuity plans](/solutions/designing-a-disaster-recovery-plan), we recommend incorporating details for the sources and sinks used with your Dataflow jobs. The [Google Cloud sales team](/contact) can help you work towards meeting your requirements.

Regional placement
------------------

By default, the region that you select configures the Dataflow worker pool to use all available zones within the region.
Zone selection is calculated for each worker at creation time, optimizing for resource acquisition and for the use of unused [reservations](/compute/docs/instances/reserving-zonal-resources).

Regional placement offers benefits such as:

- Improved resource availability: Dataflow jobs are more resilient to [zonal resource availability](/compute/docs/troubleshooting/troubleshooting-vm-creation#resource_availability) errors, because workers can continue to be created in other zones with remaining availability.
- Improved reliability: In the event of a zonal failure, Dataflow jobs can continue to run, because workers are recreated in other zones.

The following limitations apply:

- Regional placement is supported only for jobs that use Streaming Engine or Dataflow Shuffle. Jobs that have opted out of Streaming Engine or Dataflow Shuffle cannot use regional placement.
- Regional placement applies to VMs only, and doesn't apply to backend resources.
- VMs are not replicated across multiple zones. For example, if a VM becomes unavailable, its work items are considered lost and are reprocessed by another VM.
- If a region-wide stockout occurs, the Dataflow service cannot create any more VMs.
- If a zone-wide stockout occurs in one or more zones in the configured region, the Dataflow service might fail to start a job.

View job resource zones
-----------------------

Dataflow jobs depend on internal resources. Some of these backend job resources are zonal.
If a single zone fails and a zonal resource that your Dataflow job needs is in that zone, the job might fail.

To understand whether a job failed because of a zonal outage, review the service zones that your job's backend resources are using. This feature is available only for Streaming Engine jobs.

- To view the service zones in the Google Cloud console, use the **Service zones** field in the **Job info** panel.

- To review the service zones by using the API, use the [`ServiceResources`](/dataflow/docs/reference/rest/v1b3/projects.jobs#Job.ServiceResources) field.

The values in this field update throughout the duration of the job, because the resources that the job uses change while the job runs.

Automatic zone placement
------------------------

For jobs that aren't supported for regional placement, the best zone within the region is selected automatically, based on the available zone capacity at the time of the job creation request. Automatic zone selection helps ensure that job workers run in the best zone for your job.

Because the job is configured to run in a single zone, the operation might fail with a [zonal resource availability](/compute/docs/troubleshooting/troubleshooting-vm-creation#resource_availability) error if sufficient Compute Engine resources are not available. If a stockout occurs in a region, you might see a [`ZONE_RESOURCE_POOL_EXHAUSTED`](/dataflow/docs/guides/common-errors#worker-pool-failure) error.
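One way to handle this error is to retry submission with backoff until capacity becomes available. The following is a minimal sketch: the `submit` callable stands in for your own job-launch code, and `StockoutError` is a hypothetical exception type, not a Dataflow API.

```python
import time


class StockoutError(Exception):
    """Hypothetical error representing a ZONE_RESOURCE_POOL_EXHAUSTED failure."""


def launch_with_retry(submit, max_attempts=5, base_delay_s=60.0):
    """Call submit() until it succeeds, backing off exponentially on stockouts."""
    for attempt in range(max_attempts):
        try:
            return submit()
        except StockoutError:
            if attempt == max_attempts - 1:
                raise  # Out of attempts; surface the error to the caller.
            # Capacity often frees up over time, so wait before retrying.
            time.sleep(base_delay_s * 2 ** attempt)
```

In a real pipeline, `submit` would run your launch command or template invocation and raise only when the failure is a zonal resource stockout.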
You can implement a retry loop to start the job when resources become available.

In addition, when a zone is unavailable, the streaming backend can become unavailable, which might result in data loss.

Specify a region
----------------

| **Note:** Region configuration requires Apache Beam SDK version 2.0.0 or higher.

To specify a region for your job, set the `--region` option to one of the [**supported**](/dataflow/docs/resources/locations) regions. The `--region` option overrides the default region that is set in the metadata server, your local client, or the environment variables.

The [Dataflow command-line interface](/dataflow/pipelines/dataflow-command-line-intf) also supports the `--region` option for specifying regions.

Override the worker region or zone
----------------------------------

By default, when you submit a job with the `--region` option, workers are automatically assigned to either [zones across the region](#regional_placement) or the [single best zone](#autozone) within the region, depending on the job type.

If you want to ensure that the workers for your Dataflow job run strictly in a specific zone, you can specify the zone by using the following [pipeline option](/dataflow/docs/reference/pipeline-options#worker-level_options). This usage pattern is uncommon for Dataflow jobs.

This option controls only the zone used for the Dataflow workers. It doesn't apply to backend resources, which might be created in any zone within the job region.

### Java

    --workerZone

### Python

    --worker_zone

### Go

    --worker_zone

For all other cases, we don't recommend overriding the worker location.
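When you do need the override, the zone must sit inside the job region. The following sketch assembles Python-style pipeline arguments; the project, region, and zone values are hypothetical, and the `dataflow_args` helper is illustrative rather than part of any SDK.

```python
def dataflow_args(project, region, worker_zone=None):
    """Build Dataflow pipeline arguments, pinning workers to a zone only if given."""
    args = [
        "--runner=DataflowRunner",
        f"--project={project}",
        f"--region={region}",  # Job region: metadata, backends, default worker placement.
    ]
    if worker_zone is not None:
        # Uncommon: force all worker VMs into this one zone within the job region.
        args.append(f"--worker_zone={worker_zone}")
    return args


# Hypothetical values for illustration only.
print(dataflow_args("my-project", "us-central1", worker_zone="us-central1-b"))
```

Omitting `worker_zone` leaves zone selection to Dataflow, which is the recommended default.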
The [common scenarios table](#commonscenarios) contains usage recommendations for these situations.

Because the job is configured to run in a single zone, the operation might fail with a [zonal resource availability](/compute/docs/troubleshooting/troubleshooting-vm-creation#resource_availability) error if sufficient Compute Engine resources are not available.

| **Caution:** If you override the worker zone and the workers are in a different region than the job region, performance, network traffic, network latency, and network cost might be negatively affected.

You can run the `gcloud compute regions list` command to see a list of the regions and zones that are available for worker deployment.

Common scenarios
----------------

The following table contains usage recommendations for common scenarios.