Configure container health checks for services

You can configure HTTP, TCP, and gRPC startup probes, along with HTTP and gRPC liveness probes for new and existing Cloud Run services. The configuration varies depending on the type of probe.

Note that a TCP startup probe is automatically configured for a new Cloud Run service. See The default TCP startup probe for details.

Use cases

You can configure two types of health check probes:

  • Liveness probes determine whether to restart a container.

    • Restarting a container in this case can increase service availability in the event of bugs.
    • Liveness probes are intended to restart individual instances that can't be recovered in any other way. They should be used primarily for unrecoverable instance failures, for example, to catch a deadlock where a service is running, but unable to make progress. You can require a liveness probe for every container by using custom organization policies.
  • Startup probes determine whether the container has started and is ready to accept traffic.

    • When you configure a startup probe, liveness checks are disabled until the startup probe determines that the container is started, to prevent interference with the service startup.
    • Startup probes are especially useful if you use liveness checks on slow-starting containers, because they prevent those containers from being shut down before they are up and running.

Note that when a service experiences repeated startup or liveness probe failures, Cloud Run limits instance restarts to prevent uncontrolled crash loops.
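
As one illustration of the deadlock case mentioned for liveness probes, the following Python sketch shows how a service might decide what its liveness endpoint should return. It is a minimal sketch under stated assumptions, not a Cloud Run API: the `Heartbeat` class, the `liveness_status` function, and the 30-second silence limit are hypothetical names and values chosen for the example. The idea is that the worker loop calls `beat()` whenever it makes progress, and the health check handler reports failure once the worker has been silent for too long.

```python
import time


class Heartbeat:
    """Tracks when the worker loop last reported progress (hypothetical helper)."""

    def __init__(self, max_silence_seconds: float, clock=time.monotonic):
        self._max_silence = max_silence_seconds
        self._clock = clock  # injectable so the logic can be tested without sleeping
        self._last_beat = clock()

    def beat(self) -> None:
        """Call this from the worker loop each time it completes a unit of work."""
        self._last_beat = self._clock()

    def is_stale(self) -> bool:
        """True if the worker has not reported progress within the allowed silence."""
        return self._clock() - self._last_beat > self._max_silence


def liveness_status(heartbeat: Heartbeat) -> int:
    """HTTP status the liveness endpoint would return: 200 if healthy, 503 if stuck."""
    return 503 if heartbeat.is_stale() else 200
```

With this approach, a stuck worker causes the liveness endpoint to return a failing status, so repeated probe failures lead Cloud Run to restart the instance.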

The default TCP startup probe

A TCP startup probe is automatically configured for a new Cloud Run service with default values. The default probe is equivalent to the following:

startupProbe:
  tcpSocket:
    port: CONTAINER_PORT
  timeoutSeconds: 240
  periodSeconds: 240
  failureThreshold: 1

Replace CONTAINER_PORT with the container port set for your service.

You can change these default values following the instructions in the probe configuration section on this page.

Billing options

  • You must use instance-based billing for every probe.
  • All probes are billed for CPU and memory consumption, but there is no request-based charge.

Probe requirements and behavior

The requirements and behavior for each probe type are as follows:

TCP startup

  • Requirements: None.
  • Behavior: By default, Cloud Run makes a TCP connection to open the TCP socket on the specified port. If Cloud Run cannot establish a connection, the probe fails.

    If a startup probe does not succeed within the specified time (failureThreshold * periodSeconds), which cannot exceed 240 seconds, the container is shut down. See also The default TCP startup probe.

HTTP startup

  • Requirements: Create an HTTP health check endpoint. Use HTTP/1.
  • Behavior: After probe configuration, Cloud Run makes an HTTP GET request to the service health check endpoint (for example, /ready). Any response with a status code from 200 through 399 is a success; everything else indicates failure.

    If a startup probe does not succeed within the specified time (failureThreshold * periodSeconds), which cannot exceed 240 seconds, the container is shut down.

    If the HTTP startup probe succeeds within the specified time, and you have configured an HTTP liveness probe, the HTTP liveness probe is started.

HTTP liveness

  • Requirements: Create an HTTP health check endpoint. Use HTTP/1.
  • Behavior: The liveness probe starts only after the startup probe is successful. After the probe is configured and any startup probe has succeeded, Cloud Run makes an HTTP GET request to the service health check endpoint (for example, /health). Any response with a status code from 200 through 399 is a success; everything else indicates failure.

    If a liveness probe does not succeed within the specified time (failureThreshold * periodSeconds), the container is shut down using a SIGKILL signal. Any remaining requests that were still being served by the container are terminated with the HTTP status code 503. After the container is shut down, Cloud Run autoscaling starts up a new container instance.

gRPC startup

  • Requirements: Implement the gRPC Health Checking protocol in your Cloud Run service.
  • Behavior: If a startup probe does not succeed within the specified time (failureThreshold * periodSeconds), which cannot exceed 240 seconds, the container is shut down.

gRPC liveness

  • Requirements: Implement the gRPC Health Checking protocol in your Cloud Run service.
  • Behavior: If you configure a gRPC startup probe, the liveness probe starts only after the startup probe is successful.

    After the liveness probe is configured and any startup probe has succeeded, Cloud Run makes a health check request to the service.

    If a liveness probe does not succeed within the specified time (failureThreshold * periodSeconds), the container is shut down using a SIGKILL signal. After the container is shut down, Cloud Run autoscaling starts up a new container instance.
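
The HTTP success rule and the startup time budget described in this section can be sketched in Python. This is only an illustration of the stated rules, not Cloud Run's implementation: the function names `http_probe_succeeded` and `startup_window_seconds` are hypothetical, and the sketch assumes that "between 200 and 400" means status codes 200 through 399 (any 2XX or 3XX response).

```python
MAX_STARTUP_WINDOW_SECONDS = 240  # limit stated for startup probes in this section


def http_probe_succeeded(status_code: int) -> bool:
    """Return True for status codes from 200 through 399 (assumed reading of the rule)."""
    return 200 <= status_code < 400


def startup_window_seconds(failure_threshold: int, period_seconds: int) -> int:
    """Time a startup probe has to succeed: failureThreshold * periodSeconds."""
    window = failure_threshold * period_seconds
    if window > MAX_STARTUP_WINDOW_SECONDS:
        raise ValueError(
            f"startup window of {window}s exceeds {MAX_STARTUP_WINDOW_SECONDS}s"
        )
    return window
```

For example, the default TCP startup probe (failureThreshold of 1, periodSeconds of 240) uses the full 240-second budget, while a threshold of 3 with a period of 10 seconds gives a container 30 seconds to start.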

Configure probes

Any configuration change leads to the creation of a new revision. Subsequent revisions will also automatically get this configuration setting unless you make explicit updates to change it.

You can configure HTTP, TCP, and gRPC probes using the Google Cloud console, YAML, or Terraform:

Console

Important: If you are configuring your Cloud Run service for HTTP probes, you must also add an HTTP health check endpoint in your service code to respond to the probe. If you are configuring a gRPC probe, you must also implement the gRPC Health Checking protocol in your Cloud Run service.

  1. In the Google Cloud console, go to the Cloud Run page.

  2. For a new service, expand Container(s), volumes, networking, security to display the health check options. For an existing service, click the service you want to configure, then click Edit and deploy to display the health check options.

  3. In the Container(s) section, go to Health checks and click Add health check to open the Add health check configuration panel.

  4. From the Select health check type menu, select the type of health check you want to add, for example, startup or liveness.

  5. From the Select probe type menu, select the type of the probe you want to use, for example, HTTP or gRPC. This displays the probe configuration form.

  6. Note that probe configuration varies by probe type. Configure the probe settings:

    • If you are using HTTP probes:
      • Make sure your service uses HTTP/1 (the Cloud Run default), not HTTP/2.
      • Use the Path field to specify the relative path to the endpoint, for example, /.
      • Select the HTTP Headers checkbox to specify optional custom headers. Then specify the header name in the Name field and header value in the Value field. Click Add HTTP header to specify more headers.
    • For Port, specify the container port used for your service.
    • For Initial delay, specify the number of seconds to wait after the container has started before performing the first probe. Specify a value from 0 seconds to 240 seconds. The default value is 0 seconds.
    • For Period, specify the period (in seconds) at which to perform the probe. For example, specify 2 to perform the probe every 2 seconds. Specify a value from 1 second to 240 seconds. The default value is 10 seconds.
    • For Failure threshold, specify the number of times to retry the probe before shutting down the container. The default value is 3.
    • For Timeout, specify the number of seconds to wait until the probe times out. This value cannot exceed the value specified for Period. Specify a value from 1 to 240. The default is 1.
  7. Click Add to add the new health check.

YAML

Important: If you are configuring your Cloud Run service for HTTP probes, you must also add an endpoint in your service code to respond to the probe. If you are configuring a gRPC probe, you must also implement the gRPC Health Checking protocol in your Cloud Run service.

TCP startup

  1. If you are creating a new service, skip this step. If you are updating an existing service, download its YAML configuration:
    gcloud run services describe SERVICE --format export > service.yaml
  2. Configure the startupProbe attribute as shown:

    apiVersion: serving.knative.dev/v1
    kind: Service
    metadata:
      name: SERVICE
    spec:
      template:
        metadata:
        spec:
          containers:
          - image: IMAGE_URL
            startupProbe:
              tcpSocket:
                port: CONTAINER_PORT
              initialDelaySeconds: DELAY
              timeoutSeconds: TIMEOUT
              failureThreshold: THRESHOLD
              periodSeconds: PERIOD

    Replace

    • SERVICE with the name of your Cloud Run service.
    • IMAGE_URL with a reference to the container image, for example, us-docker.pkg.dev/cloudrun/container/hello:latest. If you use Artifact Registry, the repository REPO_NAME must already be created. The URL has the shape LOCATION-docker.pkg.dev/PROJECT_ID/REPO_NAME/PATH:TAG
    • (OPTIONAL) CONTAINER_PORT should be set to the container port used for your service.
    • (OPTIONAL) DELAY with the number of seconds to wait after the container has started before performing the first probe. Specify a value from 0 seconds to 240 seconds. The default value is 0 seconds.
    • (OPTIONAL) TIMEOUT with the number of seconds to wait until the probe times out. This value cannot exceed the value specified for periodSeconds. Specify a value from 1 to 240. The default is 1.
    • (OPTIONAL) THRESHOLD with the number of times to retry the probe before shutting down the container. The default value is 3.
    • (OPTIONAL) PERIOD with the period (in seconds) at which to perform the probe. For example, specify 2 to perform the probe every 2 seconds. Specify a value from 1 second to 240 seconds. The default value is 10 seconds.
  3. Create or update the service using the following command:
    gcloud run services replace service.yaml
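
Before running the replace command, it can help to check the DELAY, TIMEOUT, THRESHOLD, and PERIOD values against the ranges listed above. The following Python sketch is one way to do that locally. The `StartupProbeSettings` class and the `find_problems` function are hypothetical helpers, not part of gcloud or Cloud Run, and the checks only mirror the startup probe limits stated on this page.

```python
from dataclasses import dataclass


@dataclass
class StartupProbeSettings:
    """Startup probe values, using the defaults documented on this page."""

    initial_delay_seconds: int = 0  # DELAY
    timeout_seconds: int = 1        # TIMEOUT
    failure_threshold: int = 3      # THRESHOLD
    period_seconds: int = 10        # PERIOD


def find_problems(settings: StartupProbeSettings) -> list[str]:
    """Return a list of human-readable problems; an empty list means no issue found."""
    problems = []
    if not 0 <= settings.initial_delay_seconds <= 240:
        problems.append("initialDelaySeconds must be from 0 to 240")
    if not 1 <= settings.timeout_seconds <= 240:
        problems.append("timeoutSeconds must be from 1 to 240")
    if not 1 <= settings.period_seconds <= 240:
        problems.append("periodSeconds must be from 1 to 240")
    if settings.timeout_seconds > settings.period_seconds:
        problems.append("timeoutSeconds cannot exceed periodSeconds")
    return problems
```

For instance, `find_problems(StartupProbeSettings(timeout_seconds=5, period_seconds=3))` reports that the timeout exceeds the period, while the documented defaults produce no problems.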

HTTP startup

  1. If you are creating a new service, skip this step. If you are updating an existing service, download its YAML configuration:
    gcloud run services describe SERVICE --format export > service.yaml
  2. Make sure your service uses HTTP/1 (the Cloud Run default), not HTTP/2.

  3. Configure the startupProbe attribute as shown:

    apiVersion: serving.knative.dev/v1
    kind: Service
    metadata:
      name: SERVICE
    spec:
      template:
        metadata:
        spec:
          containers:
          - image: IMAGE_URL
            startupProbe:
              httpGet:
                path: PATH
                port: CONTAINER_PORT
                httpHeaders:
                  - name: HEADER_NAME
                    value: HEADER_VALUE
              initialDelaySeconds: DELAY
              timeoutSeconds: TIMEOUT
              failureThreshold: THRESHOLD
              periodSeconds: PERIOD

    Replace

    • SERVICE with the name of your Cloud Run service.
    • IMAGE_URL with a reference to the container image, for example, us-docker.pkg.dev/cloudrun/container/hello:latest. If you use Artifact Registry, the repository REPO_NAME must already be created. The URL has the shape LOCATION-docker.pkg.dev/PROJECT_ID/REPO_NAME/PATH:TAG
    • PATH with the relative path to the HTTP endpoint, for example, /ready.
    • (OPTIONAL) CONTAINER_PORT should be set to the container port used for your service.
    • (OPTIONAL) httpHeaders can be used to supply multiple or repeated custom headers using the HEADER_NAME and HEADER_VALUE fields as shown.
    • (OPTIONAL) DELAY with the number of seconds to wait after the container has started before performing the first probe. Specify a value from 0 seconds to 240 seconds. The default value is 0 seconds.
    • (OPTIONAL) TIMEOUT with the number of seconds to wait until the probe times out. This value cannot exceed the value specified for periodSeconds. Specify a value from 1 to 240. The default is 1.
    • (OPTIONAL) THRESHOLD with the number of times to retry the probe before shutting down the container. The default value is 3.
    • (OPTIONAL) PERIOD with the period (in seconds) at which to perform the probe. For example, specify 2 to perform the probe every 2 seconds. Specify a value from 1 second to 240 seconds. The default value is 10 seconds.
  4. Create or update the service using the following command:
    gcloud run services replace service.yaml

HTTP liveness

  1. If you are creating a new service, skip this step. If you are updating an existing service, download its YAML configuration:
    gcloud run services describe SERVICE --format export > service.yaml
  2. Make sure your service uses HTTP/1 (the Cloud Run default), not HTTP/2.

  3. Configure the livenessProbe attribute as shown:

    apiVersion: serving.knative.dev/v1
    kind: Service
    metadata:
      name: SERVICE
    spec:
      template:
        metadata:
        spec:
          containers:
          - image: IMAGE_URL
            livenessProbe:
              httpGet:
                path: PATH
                port: CONTAINER_PORT
                httpHeaders:
                  - name: HEADER_NAME
                    value: HEADER_VALUE
              initialDelaySeconds: DELAY
              timeoutSeconds: TIMEOUT
              failureThreshold: THRESHOLD
              periodSeconds: PERIOD

    Replace

    • SERVICE with the name of your Cloud Run service.
    • IMAGE_URL with a reference to the container image, for example, us-docker.pkg.dev/cloudrun/container/hello:latest. If you use Artifact Registry, the repository REPO_NAME must already be created. The URL has the shape LOCATION-docker.pkg.dev/PROJECT_ID/REPO_NAME/PATH:TAG
    • PATH with the relative path to the HTTP endpoint, for example, /ready.
    • (OPTIONAL) CONTAINER_PORT should be set to the container port used for your service.
    • (OPTIONAL) httpHeaders can be used to supply multiple or repeated custom headers using the HEADER_NAME and HEADER_VALUE fields as shown.
    • (OPTIONAL) DELAY with the number of seconds to wait after the container has started before performing the first probe. Specify a value from 0 seconds to 240 seconds. The default value is 0 seconds.
    • (OPTIONAL) TIMEOUT with the number of seconds to wait until the probe times out. This value cannot exceed the value specified for periodSeconds. Specify a value from 1 to 3600. The default is 1.
    • (OPTIONAL) THRESHOLD with the number of times to retry the probe before shutting down the container. The default value is 3.
    • (OPTIONAL) PERIOD with the period (in seconds) at which to perform the probe. For example, specify 2 to perform the probe every 2 seconds. Specify a value from 1 second to 3600 seconds. The default value is 10 seconds.
  4. Create or update the service using the following command:
    gcloud run services replace service.yaml

gRPC startup

  1. If you are creating a new service, skip this step. If you are updating an existing service, download its YAML configuration:
    gcloud run services describe SERVICE --format export > service.yaml
  2. Configure the startupProbe attribute as shown:

    apiVersion: serving.knative.dev/v1
    kind: Service
    metadata:
      name: SERVICE
    spec:
      template:
        metadata:
        spec:
          containers:
          - image: IMAGE_URL
            startupProbe:
              grpc:
                service: GRPC_SERVICE
                port: CONTAINER_PORT
              initialDelaySeconds: DELAY
              timeoutSeconds: TIMEOUT
              failureThreshold: THRESHOLD
              periodSeconds: PERIOD

    Replace

    • SERVICE with the name of your Cloud Run service.
    • IMAGE_URL with a reference to the container image, for example, us-docker.pkg.dev/cloudrun/container/hello:latest. If you use Artifact Registry, the repository REPO_NAME must already be created. The URL has the shape LOCATION-docker.pkg.dev/PROJECT_ID/REPO_NAME/PATH:TAG
    • (OPTIONAL) GRPC_SERVICE. If set, this is used in the service field of the grpc.health.v1.HealthCheckRequest when the grpc.health.v1.Health.Check rpc is called.
    • (OPTIONAL) CONTAINER_PORT should be set to the container port used for your service.
    • (OPTIONAL) DELAY with the number of seconds to wait after the container has started before performing the first probe. Specify a value from 0 seconds to 240 seconds. The default value is 0 seconds.
    • (OPTIONAL) TIMEOUT with the number of seconds to wait until the probe times out. This value cannot exceed the value specified for periodSeconds. Specify a value from 1 to 240. The default is 1.
    • (OPTIONAL) THRESHOLD with the number of times to retry the probe before shutting down the container. The default value is 3.
    • (OPTIONAL) PERIOD with the period (in seconds) at which to perform the probe. For example, specify 2 to perform the probe every 2 seconds. Specify a value from 1 second to 240 seconds. The default value is 10 seconds.
  3. Create or update the service using the following command:
    gcloud run services replace service.yaml

gRPC liveness

  1. If you are creating a new service, skip this step. If you are updating an existing service, download its YAML configuration:
    gcloud run services describe SERVICE --format export > service.yaml
  2. Configure the livenessProbe attribute as shown:

    apiVersion: serving.knative.dev/v1
    kind: Service
    metadata:
      name: SERVICE
    spec:
      template:
        metadata:
        spec:
          containers:
          - image: IMAGE_URL
            livenessProbe:
              grpc:
                port: CONTAINER_PORT
                service: GRPC_SERVICE
              initialDelaySeconds: DELAY
              timeoutSeconds: TIMEOUT
              failureThreshold: THRESHOLD
              periodSeconds: PERIOD

    Replace

    • SERVICE with the name of your Cloud Run service.
    • IMAGE_URL with a reference to the container image, for example, us-docker.pkg.dev/cloudrun/container/hello:latest. If you use Artifact Registry, the repository REPO_NAME must already be created. The URL has the shape LOCATION-docker.pkg.dev/PROJECT_ID/REPO_NAME/PATH:TAG
    • (OPTIONAL) CONTAINER_PORT should be set to the container port used for your service.
    • (OPTIONAL) GRPC_SERVICE. If set, this is used in the service field of the grpc.health.v1.HealthCheckRequest when the grpc.health.v1.Health.Check rpc is called.
    • (OPTIONAL) DELAY with the number of seconds to wait after the container has started before performing the first probe. Specify a value from 0 seconds to 240 seconds. The default value is 0 seconds.
    • (OPTIONAL) TIMEOUT with the number of seconds to wait until the probe times out. This value cannot exceed the value specified for periodSeconds. Specify a value from 1 to 3600. The default is 1.
    • (OPTIONAL) THRESHOLD with the number of times to retry the probe before shutting down the container. The default value is 3.
    • (OPTIONAL) PERIOD with the period (in seconds) at which to perform the probe. For example, specify 2 to perform the probe every 2 seconds. Specify a value from 1 second to 3600 seconds. The default value is 10 seconds.
  3. Create or update the service using the following command:
    gcloud run services replace service.yaml

Terraform

Important: If you are configuring your Cloud Run service for HTTP probes, you must also add an endpoint in your service code to respond to the probe. If you are configuring a gRPC probe, you must also implement the gRPC Health Checking protocol in your Cloud Run service.

To learn how to apply or remove a Terraform configuration, see Basic Terraform commands.

TCP startup

Configure your Cloud Run service with the startup_probe attribute as shown:

resource "google_cloud_run_v2_service" "default" {
  name     = "cloudrun-service-healthcheck"
  location = "us-central1"

  deletion_protection = false # set to "true" in production

  template {
    containers {
      image = "us-docker.pkg.dev/cloudrun/container/hello"

      startup_probe {
        failure_threshold     = 5
        initial_delay_seconds = 10
        timeout_seconds       = 3
        period_seconds        = 3

        tcp_socket {
          port = 8080
        }
      }
    }
  }
}

HTTP startup

Make sure your service uses HTTP/1 (the Cloud Run default), not HTTP/2.

Configure your Cloud Run service with the startup_probe attribute as shown:

resource "google_cloud_run_v2_service" "default" {
  name     = "cloudrun-service-healthcheck"
  location = "us-central1"

  deletion_protection = false # set to "true" in production

  template {
    containers {
      image = "us-docker.pkg.dev/cloudrun/container/hello"

      startup_probe {
        failure_threshold     = 5
        initial_delay_seconds = 10
        timeout_seconds       = 3
        period_seconds        = 3

        http_get {
          path = "/"
          # Custom headers to set in the request
          # https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/cloud_run_v2_service#http_headers
          http_headers {
            name  = "Access-Control-Allow-Origin"
            value = "*"
          }
        }
      }
    }
  }
}

HTTP liveness

Make sure your service uses HTTP/1 (the Cloud Run default), not HTTP/2.

Configure your Cloud Run service with the liveness_probe attribute as shown:

resource "google_cloud_run_v2_service" "default" {
  name     = "cloudrun-service-healthcheck"
  location = "us-central1"

  deletion_protection = false # set to "true" in production

  template {
    containers {
      image = "us-docker.pkg.dev/cloudrun/container/hello"

      liveness_probe {
        failure_threshold     = 5
        initial_delay_seconds = 10
        timeout_seconds       = 3
        period_seconds        = 3

        http_get {
          path = "/"
          # Custom headers to set in the request
          # https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/cloud_run_v2_service#http_headers
          http_headers {
            name  = "Access-Control-Allow-Origin"
            value = "*"
          }
        }
      }
    }
  }
}

gRPC startup

Configure your Cloud Run service with the startup_probe attribute as shown:

resource "google_cloud_run_v2_service" "default" {
  name     = "cloudrun-service-healthcheck"
  location = "us-central1"

  deletion_protection = false # set to "true" in production

  template {
    containers {
      # Note: Change to the name of your image
      image = "us-docker.pkg.dev/cloudrun/container/hello"

      startup_probe {
        failure_threshold     = 5
        initial_delay_seconds = 10
        timeout_seconds       = 3
        period_seconds        = 3

        grpc {
          # Note: Change to the name of your pre-existing grpc health status service
          service = "grpc.health.v1.Health"
        }
      }
    }
  }
}

gRPC liveness

Configure your Cloud Run service with the liveness_probe attribute as shown:

resource "google_cloud_run_v2_service" "default" {
  name     = "cloudrun-service-healthcheck"
  location = "us-central1"

  deletion_protection = false # set to "true" in production

  template {
    containers {
      # Note: Change to the name of your image
      image = "us-docker.pkg.dev/cloudrun/container/hello"

      liveness_probe {
        failure_threshold     = 5
        initial_delay_seconds = 10
        timeout_seconds       = 3
        period_seconds        = 3

        # Note: Change to the name of your pre-existing grpc health status service
        grpc {
          service = "grpc.health.v1.Health"
        }
      }
    }
  }
}

Create HTTP health check endpoints

If you configure your Cloud Run service for an HTTP startup probe or liveness probe, you need to add an endpoint in your service code to respond to the probe. The endpoint can have whatever name you want, for example, /startup or /ready, but it must match the value you specify for path in the probe configuration. For example, if you specify /ready for an HTTP startup probe, you specify path in your probe configuration as shown:

startupProbe:
  httpGet:
    path: /ready

HTTP health check endpoints are externally accessible and follow the same principles as any other HTTP service endpoints that are exposed externally.
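
On the service side, an endpoint that answers the probe could look like the following Python sketch, which uses only the standard library. It is a minimal example under stated assumptions rather than a recommended implementation: the `/ready` path matches the probe configuration shown above, the `HealthHandler` class and `make_server` function are hypothetical names, and the response bodies are arbitrary. Any status code from 200 through 399 would count as success, so the handler returns 200.

```python
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

READY_PATH = "/ready"  # must match httpGet.path in the probe configuration


class HealthHandler(BaseHTTPRequestHandler):
    """Serves the health check endpoint alongside a placeholder root route."""

    def do_GET(self):
        path = self.path.split("?", 1)[0]  # ignore any query string
        if path == READY_PATH:
            self._respond(200, b"ok")
        elif path == "/":
            self._respond(200, b"Hello")
        else:
            self._respond(404, b"not found")

    def _respond(self, status: int, body: bytes) -> None:
        self.send_response(status)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        """Suppress per-request logging so frequent probes do not flood the logs."""


def make_server(port: int) -> ThreadingHTTPServer:
    """Create a server listening on all interfaces at the given container port."""
    return ThreadingHTTPServer(("0.0.0.0", port), HealthHandler)
```

To run the sketch as a service, call `make_server(int(os.environ.get("PORT", "8080"))).serve_forever()` after importing `os`, so that the server listens on the container port that Cloud Run provides through the PORT environment variable. Because the endpoint is externally accessible, avoid returning sensitive details in the response body.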