Use Image streaming to reduce container startup time

This document describes how to use Image streaming to pull container images into Batch container jobs.

Image streaming enables Batch jobs to initialize without waiting for a container image to finish downloading, which provides the following benefits:

  • Reduced latency when pulling large images
  • Faster time to begin job execution

Before you begin

  1. If you haven't used Batch before, review Get started with Batch and enable Batch by completing the prerequisites for projects and users.
  2. To get the permissions that you need to create a job, ask your administrator to grant you the following IAM roles:

    For more information about granting roles, see Manage access to projects, folders, and organizations.

    You might also be able to get the required permissions through custom roles or other predefined roles.

  3. If you haven't already done so, enable the Container File System API by running the following command:

    gcloud services enable containerfilesystem.googleapis.com
    
  4. If your container images are protected by VPC Service Controls, update your service perimeter to include containerfilesystem.googleapis.com.

Limitations

Batch Image streaming has the following limitations:

  • Batch only supports Image streaming for container images that are stored in Artifact Registry. If you currently use Container Registry to manage your container images, you can transition to Artifact Registry.
  • You must run your Batch job's VMs in the same location where your container image is stored in Artifact Registry.
  • Containers that use the Docker image manifest version 2, schema 1 aren't supported.
  • When you use Image streaming, container runnables only support the following fields:
    • imageUri
    • commands
    • entrypoint
    • volumes
    • enableImageStreaming
  • Container images with empty layers or duplicate layers aren't supported.
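Some of these limitations can be checked locally before a job is submitted. The following is a minimal sketch, not part of Batch or any Google Cloud library: check_streaming_container is a hypothetical helper, and the host pattern assumes that Artifact Registry Docker images use a LOCATION-docker.pkg.dev host, as in the example configuration in this document. It does not inspect the image manifest or its layers.

```python
import re

# Fields that the limitations list names as supported for container
# runnables when Image streaming is enabled.
SUPPORTED_FIELDS = {
    "imageUri",
    "commands",
    "entrypoint",
    "volumes",
    "enableImageStreaming",
}

# Assumption: Artifact Registry Docker image URIs start with a host of the
# form LOCATION-docker.pkg.dev, as in this document's example configuration.
_AR_HOST = re.compile(r"^[a-z0-9-]+-docker\.pkg\.dev/")


def check_streaming_container(container: dict) -> list[str]:
    """Return a list of problems found in a container runnable.

    Hypothetical helper for illustration; an empty list means that no
    problem was detected by these two checks.
    """
    problems = []

    # Flag any field that is not in the supported set.
    unsupported = sorted(set(container) - SUPPORTED_FIELDS)
    if unsupported:
        problems.append("unsupported fields: " + ", ".join(unsupported))

    # Flag image URIs that do not look like Artifact Registry images.
    if not _AR_HOST.match(container.get("imageUri", "")):
        problems.append("imageUri is not an Artifact Registry image")

    return problems
```

A caller could run this check on each container runnable in a job configuration and refuse to submit the job if the returned list is not empty.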

Create a job that uses Image streaming

Create a Batch container job that uses Image streaming by doing the following:

Use the Google Cloud CLI or REST API to create a container job. To enable Image streaming for a container runnable, set the enableImageStreaming field to true and set the imageUri field to an image that is stored in an Artifact Registry location that contains the location of the job's VMs.

"container": {
    ...
    "enableImageStreaming": true
}

For example, a job that uses Image streaming would have a JSON configuration file similar to the following:

{
    "taskGroups": [
        {
            "taskCount": "1",
            "taskCountPerNode": "1",
            "taskSpec": {
                "runnables": [
                    {
                        "container": {
                            "imageUri": "LOCATION-docker.pkg.dev/PROJECT_ID/REPOSITORY/IMAGE:TAG",
                            "enableImageStreaming": true
                        }
                    }
                ]
            }
        }
    ],
    "allocationPolicy": {
        "instances": [
            {
                "policy": {
                    "machineType": "e2-standard-4"
                }
            }
        ]
    },
    "logsPolicy": {
        "destination": "CLOUD_LOGGING"
    }
}

Replace the following values:

  • LOCATION: the regional or multi-regional location of the repository where the image is stored, for example us-east1 or us. The location of the container image must be the same as, or (for a multi-regional location) contain, the location of the Batch job's VMs.
  • PROJECT_ID: the project ID of the project that contains the container image. If your project ID contains a colon (:), see Domain-scoped projects.
  • REPOSITORY: the name of the repository where the image is stored.
  • IMAGE: the name of the container image.
  • TAG: the tag applied to the image.
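The placeholder substitution above can also be scripted. The following is a minimal sketch that builds the same configuration as a Python dictionary and prints it as JSON. The build_streaming_job helper, its parameters, and the sample values are hypothetical. The location check assumes that a multi-regional location such as us contains a region when the region name starts with "us-", which is a heuristic for illustration rather than an official rule.

```python
import json


def build_streaming_job(location, project_id, repository, image, tag, job_region):
    """Build the example job configuration as a dict (hypothetical helper)."""
    # Assumption: the repository location either equals the job's region or
    # is a multi-regional location whose name prefixes the region name
    # (for example, "us" and "us-east1").
    if location != job_region and not job_region.startswith(location + "-"):
        raise ValueError(
            f"repository location {location!r} does not match region {job_region!r}"
        )

    image_uri = f"{location}-docker.pkg.dev/{project_id}/{repository}/{image}:{tag}"
    return {
        "taskGroups": [
            {
                "taskCount": "1",
                "taskCountPerNode": "1",
                "taskSpec": {
                    "runnables": [
                        {
                            "container": {
                                "imageUri": image_uri,
                                "enableImageStreaming": True,
                            }
                        }
                    ]
                },
            }
        ],
        "allocationPolicy": {
            "instances": [{"policy": {"machineType": "e2-standard-4"}}]
        },
        "logsPolicy": {"destination": "CLOUD_LOGGING"},
    }


if __name__ == "__main__":
    # Sample values are placeholders, not real resources.
    job = build_streaming_job(
        "us", "my-project", "my-repo", "my-image", "latest", "us-east1"
    )
    print(json.dumps(job, indent=4))
```

The printed JSON could be saved to a file and passed to the gcloud CLI, for example with gcloud batch jobs submit JOB_NAME --location REGION --config FILE, where JOB_NAME, REGION, and FILE are placeholders.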

What's next