Resource model

The following diagram shows the Cloud Run resource model for services:

[Figure: Cloud Run services and revisions]

The diagram shows a Google Cloud project containing three Cloud Run services, Service A, Service B and Service C, each of which has several revisions.

In the diagram, Service A is receiving many requests, so several instances have started, each running a single container. Service B is not currently receiving requests, so no instance has been started yet. Service C runs multiple containers per instance within each revision, and only the ingress container receives requests. Each instance with multiple containers scales as an independent unit.

Cloud Run services

The service is the main resource of Cloud Run. Each service is located in a specific Google Cloud region. For redundancy and failover, services are automatically replicated across multiple zones within that region. A given Google Cloud project can run many services in different regions.

Each service exposes a unique endpoint and automatically scales the underlying infrastructure to handle incoming requests. You can deploy a service from a container, repository, or source code.

Cloud Run revisions

Each deployment to a service creates a revision. A revision consists of one or more container images, along with settings such as environment variables, memory limits, and the concurrency value.

Revisions are immutable: once a revision has been created, it cannot be modified. For example, when you deploy a container image to a new Cloud Run service, the first revision is created. If you then deploy a different container image to that same service, a second revision is created. If you subsequently set an environment variable, a third revision is created, and so on.
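
The deployment sequence above can be sketched as a small in-memory model. This is only an illustration of the immutability rule, not the Cloud Run API: the `Revision` and `Service` classes, the `deploy` method, and the image names are all hypothetical.

```python
from dataclasses import dataclass, field, FrozenInstanceError
from types import MappingProxyType
from typing import Dict, List, Mapping, Optional


@dataclass(frozen=True)
class Revision:
    """Hypothetical immutable snapshot: one image plus its settings."""
    number: int
    image: str
    env: Mapping[str, str] = field(default_factory=dict)


class Service:
    """Hypothetical service: every change appends a new revision."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.revisions: List[Revision] = []

    def deploy(self, image: Optional[str] = None,
               env: Optional[Dict[str, str]] = None) -> Revision:
        latest = self.revisions[-1] if self.revisions else None
        if latest is None and image is None:
            raise ValueError("the first deployment needs a container image")
        # Start from the previous settings, then apply the requested change.
        new_image = image if image is not None else latest.image
        merged_env = dict(latest.env) if latest else {}
        merged_env.update(env or {})
        revision = Revision(len(self.revisions) + 1, new_image,
                            MappingProxyType(merged_env))
        self.revisions.append(revision)
        return revision


service = Service("hello")
first = service.deploy(image="example/app:v1")      # first revision
second = service.deploy(image="example/app:v2")     # different image: second revision
third = service.deploy(env={"LOG_LEVEL": "debug"})  # new variable: third revision

try:
    first.image = "example/app:v3"  # existing revisions cannot be changed
except FrozenInstanceError:
    print("revisions cannot be modified after creation")
```

In this sketch, changing only an environment variable still produces a full new revision that carries the previous image forward, which mirrors the behavior described above.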

Requests are automatically routed as soon as possible to the latest healthy service revision.
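
As a rough illustration of "latest healthy revision", the selection rule might be expressed as follows. The `RevisionStatus` type, the `pick_serving_revision` function, and the revision names are hypothetical, and the sketch ignores any other routing behavior the platform may implement.

```python
from typing import NamedTuple, Optional, Sequence


class RevisionStatus(NamedTuple):
    """Hypothetical record: a revision name and whether it is healthy."""
    name: str
    healthy: bool


def pick_serving_revision(
    revisions: Sequence[RevisionStatus],
) -> Optional[RevisionStatus]:
    """Return the newest healthy revision; input is ordered oldest first."""
    for revision in reversed(revisions):
        if revision.healthy:
            return revision
    return None  # no healthy revision exists


history = [
    RevisionStatus("hello-00001", healthy=True),
    RevisionStatus("hello-00002", healthy=True),
    RevisionStatus("hello-00003", healthy=False),  # newest, but not healthy
]
target = pick_serving_revision(history)
```

Here `target` is `hello-00002`: the newest revision is skipped because it is not healthy.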

Cloud Run functions

Functions are a type of service that contains short snippets of code for building upon and connecting cloud services.

With Cloud Run, you write single-purpose functions that are attached to events emitted from your cloud infrastructure and services. Your function is triggered when an event being watched is fired. Your code executes in a fully managed environment. Because functions run as Cloud Run services, you don't need to provision any infrastructure or worry about managing any servers.

You can write Cloud Run functions using a number of supported programming languages. You can take your function and run it in any standard runtime environment for one of the supported languages, which makes it easier to port the function and test it locally.

Cloud Run function events and triggers

Cloud events are things that happen in your cloud environment. These might be things like changes to data in a database, files added to a storage system, or a new virtual machine instance being created.

Events occur whether or not you choose to respond to them. You create a response to an event with a trigger. A trigger is a declaration that you are interested in a certain event or set of events. Binding a function to a trigger lets you capture and act on events. For more information on creating triggers and associating them with your functions, see Invoke with HTTPS and Trigger with events.
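
The relationship between events, triggers, and functions can be sketched in plain Python. This is a conceptual model only, not the Cloud Run or Eventarc API: the `TriggerRegistry` class, its `bind` and `emit` methods, and the event type strings are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, Dict, List

Event = Dict[str, str]
Handler = Callable[[Event], str]


class TriggerRegistry:
    """Hypothetical registry that maps event types to bound functions."""

    def __init__(self) -> None:
        self._bindings: DefaultDict[str, List[Handler]] = defaultdict(list)

    def bind(self, event_type: str, handler: Handler) -> None:
        """Declare interest in an event type and attach a function to it."""
        self._bindings[event_type].append(handler)

    def emit(self, event: Event) -> List[str]:
        """Deliver an event to its bound functions; others are ignored."""
        return [handler(event)
                for handler in self._bindings.get(event["type"], [])]


def on_file_added(event: Event) -> str:
    """Single-purpose function that reacts to one kind of event."""
    return f"processing {event['subject']}"


registry = TriggerRegistry()
registry.bind("storage.file.added", on_file_added)

# An event with a matching trigger invokes the function.
handled = registry.emit({"type": "storage.file.added", "subject": "report.csv"})
# An event with no trigger still occurs, but nothing responds to it.
ignored = registry.emit({"type": "vm.instance.created", "subject": "vm-1"})
```

The second call shows that events happen regardless of whether a trigger exists; without a binding, no function runs.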

Cloud Run jobs

Each job is located in a specific Google Cloud region and executes one or more containers to completion. A job consists of one or more independent tasks that are executed in parallel in a given job execution. Each task runs one container and might retry it if it fails.

Cloud Run job executions

When a job is executed, a job execution is created in which all job tasks are started. All tasks in a job execution must complete successfully for the job execution to be successful. You can set timeouts on tasks and specify the number of retries in case of task failure. If any task exceeds its maximum number of retries, that task is marked as failed and the job execution is marked as failed. By default, tasks execute in parallel up to a maximum of 100, but you can specify a lower maximum if any of your backing resources require it.
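
The success rule for a job execution can be simulated locally with a thread pool. This is a minimal sketch under stated assumptions: `run_task` and `run_execution` are hypothetical helpers, the `max_retries` default is arbitrary, and only the parallelism cap of 100 is taken from the text above.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict


def run_task(task: Callable[[int], None], index: int, max_retries: int) -> bool:
    """Run one task; retry up to max_retries times after the first failure."""
    for _attempt in range(max_retries + 1):
        try:
            task(index)
            return True
        except Exception:
            continue  # try again until the retries are used up
    return False  # retries exceeded: the task is marked as failed


def run_execution(task: Callable[[int], None], task_count: int,
                  max_retries: int = 3, parallelism: int = 100) -> bool:
    """Run all tasks with bounded parallelism; succeed only if all succeed."""
    workers = max(1, min(parallelism, task_count))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(lambda i: run_task(task, i, max_retries),
                                range(task_count)))
    return all(results)


attempts: Dict[int, int] = {}


def flaky(index: int) -> None:
    """Task 2 fails on its first attempt, then succeeds on retry."""
    attempts[index] = attempts.get(index, 0) + 1
    if index == 2 and attempts[index] == 1:
        raise RuntimeError("transient failure")


def broken(index: int) -> None:
    """Task 0 fails on every attempt."""
    if index == 0:
        raise RuntimeError("permanent failure")


flaky_succeeded = run_execution(flaky, task_count=5, max_retries=1, parallelism=2)
broken_succeeded = run_execution(broken, task_count=5, max_retries=1, parallelism=2)
```

In the sketch, the first execution succeeds because the failing task recovers within its retry budget, while the second fails because a single task exhausts its retries. Lowering `parallelism` limits how many tasks run at once, which is the knob to use when a backing resource cannot handle many simultaneous connections.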

Cloud Run instances

Each revision receiving requests is automatically scaled to the number of instances needed to handle those requests. The ingress container within an instance can receive many requests at the same time. With the concurrency setting, you can set the maximum number of requests that can be sent in parallel to a given instance.
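
As a back-of-the-envelope illustration of how the concurrency setting relates to instance count, the arithmetic can be written out as below. The `instances_needed` function is hypothetical, the concurrency value of 80 is just an example, and the sketch ignores any other signals the autoscaler may take into account.

```python
import math


def instances_needed(concurrent_requests: int, concurrency: int) -> int:
    """Estimate instances so that none exceeds its concurrency limit."""
    if concurrency < 1:
        raise ValueError("concurrency must be at least 1")
    if concurrent_requests < 0:
        raise ValueError("concurrent_requests cannot be negative")
    return math.ceil(concurrent_requests / concurrency)


idle = instances_needed(0, 80)            # no requests, no instances
busy = instances_needed(200, 80)          # 200 in-flight requests, 80 per instance
one_at_a_time = instances_needed(200, 1)  # one request per instance
```

With these example numbers, `busy` is 3 and `one_at_a_time` is 200, which shows how a lower concurrency setting increases the number of instances required for the same load.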