You can customize the runtime environment of user code in Dataflow
pipelines by supplying a custom container image. Custom containers are
supported for pipelines that use Dataflow
Runner v2.
When Dataflow starts up worker VMs, it uses Docker container
images to launch containerized SDK processes on the workers. By default, a
pipeline uses a prebuilt
Apache Beam image.
However, you can provide a custom container image for your Dataflow job.
When you specify a custom container image, Dataflow launches workers
that pull the specified image.
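As a rough sketch of how a custom image might be specified, the following Python snippet assembles the command-line flags for a job. The flag names `--sdk_container_image` and `--experiments=use_runner_v2` come from the Apache Beam Python SDK and are not stated in this section, so check them against your SDK version. The helper `build_dataflow_args` and all project, bucket, and image values are hypothetical.

```python
def build_dataflow_args(project, region, temp_location, image_uri):
    """Return pipeline flags that ask Dataflow to run user code in a custom image.

    Hypothetical helper for illustration; it is not part of any SDK.
    """
    return [
        "--runner=DataflowRunner",
        f"--project={project}",
        f"--region={region}",
        f"--temp_location={temp_location}",
        # Custom containers are supported for pipelines that use Runner v2.
        "--experiments=use_runner_v2",
        # Image that workers pull instead of the prebuilt Apache Beam image.
        f"--sdk_container_image={image_uri}",
    ]


# Placeholder values, assumed for illustration only.
args = build_dataflow_args(
    project="example-project",
    region="us-central1",
    temp_location="gs://example-bucket/tmp",
    image_uri="us-docker.pkg.dev/example-project/example-repo/beam-custom:latest",
)
print(" ".join(args))
```

With the Apache Beam SDK installed, a list like this could be passed to `PipelineOptions(args)` when constructing the pipeline.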
You might use a custom container for the following reasons:
- Preinstall pipeline dependencies to reduce worker start time.
- Preinstall pipeline dependencies that are not available in public repositories.
- Preinstall pipeline dependencies when access to public repositories is turned off. Access might be turned off for security reasons.
Custom containers also let you prestage large files and launch third-party software to customize the execution environment.
Last updated 2025-03-21 UTC.