This page describes how to set up highly resilient Cloud Composer environments.
About resiliency for zonal failures in Cloud Composer
Highly resilient Cloud Composer environments use built-in redundancy and failover mechanisms that reduce the environment's susceptibility to zonal failures and single point of failure outages.
For example, a zonal outage interrupts Airflow tasks that run in the affected zone. Afterwards, a highly resilient environment recovers, restarts its affected components in a different zone, and switches its database to a secondary zone. Airflow can then reschedule and restart the failed tasks, and the history of DAG runs and other settings is preserved.
A highly resilient environment runs across at least two zones of a selected region. Cloud Composer automatically distributes the components of your environment between zones.
You can use highly resilient Cloud Composer environments for critical business processes.
About the highly available database of your environment
In highly resilient Cloud Composer environments, the Cloud SQL instance that stores the database of your environment runs in high availability mode. A Cloud SQL instance configured for high availability is also called a regional instance and is located in a primary and a secondary zone within the configured region. A regional instance is made up of a primary instance and a standby instance.
In case of an outage, the Cloud SQL instance of your environment automatically fails over to the standby instance. You do not need to perform any additional actions in your Cloud Composer environment. Once the primary zone is operational again, the environment switches back to having two zones (primary and secondary). Primary and secondary zones can be swapped in some cases. The Cloud SQL instance in high availability mode uses the same IP address after a failover.
About highly available Airflow components
Highly resilient Cloud Composer environments run Airflow components that are distributed between zones.
Your environment always runs exactly two Airflow schedulers, two web servers, and at least two (but no more than ten) triggerers if triggerers are enabled. These pairs of components run in separate zones. The minimum number of workers is set to two, and your environment's cluster distributes worker instances between zones. In case of a zonal outage, affected worker instances are rescheduled in a different zone.
For more information about the architecture of highly resilient environments, see Highly resilient environment architecture.
Before you begin
Highly resilient environments are available only in Private IP environments.
Highly resilient environments are offered at an incremental charge when compared to regular environments.
Highly resilient environments are available in Cloud Composer version 2.2.0 and later versions.
If you want to update a standard environment to a highly resilient one, make sure that it meets the following configuration requirements. If your environment doesn't meet these requirements, you can update its scale and performance parameters.
- The minimum number of Airflow workers is 2 or more.
- The number of Airflow schedulers is exactly 2.
- If you use deferrable operators in your DAGs, the number of triggerers is at least 2.
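The requirements above can be expressed as a small pre-flight check. The following is a minimal sketch, not part of any Cloud Composer tooling: the function name `meets_high_resilience_requirements` and its arguments are hypothetical, and the values are assumed to be supplied by you (for example, read from the Environment configuration tab).

```shell
# Hypothetical helper: checks the three requirements listed above.
# Nothing is read from Cloud Composer; the caller supplies every value.
#   $1 - minimum number of Airflow workers
#   $2 - number of Airflow schedulers
#   $3 - number of triggerers (0 if triggerers are disabled)
#   $4 - "yes" if your DAGs use deferrable operators, otherwise "no"
# Prints one message per unmet requirement and returns non-zero if any fail.
meets_high_resilience_requirements() {
  min_workers=$1
  schedulers=$2
  triggerers=$3
  uses_deferrable=$4
  status=0
  if [ "$min_workers" -lt 2 ]; then
    echo "minimum number of workers must be 2 or more (got $min_workers)"
    status=1
  fi
  if [ "$schedulers" -ne 2 ]; then
    echo "number of schedulers must be exactly 2 (got $schedulers)"
    status=1
  fi
  if [ "$uses_deferrable" = "yes" ] && [ "$triggerers" -lt 2 ]; then
    echo "at least 2 triggerers are required (got $triggerers)"
    status=1
  fi
  return "$status"
}

# Example usage:
#   meets_high_resilience_requirements 2 2 0 no && echo "ready to update"
```

If the check reports a failure, adjust the scale and performance parameters of the environment before switching its resilience mode.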
Create a highly resilient environment
To create a highly resilient environment, enable the high resilience mode when you create an environment.
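As an illustration, the following sketch prints (but does not run) a create command for a highly resilient environment. It rests on two assumptions that this page does not confirm: that `gcloud composer environments create` accepts the same `--enable-high-resilience` flag as the update command shown later on this page, and that `--enable-private-environment` is the flag that selects a Private IP environment. The function name `print_create_command` is hypothetical. Check the flags against the Google Cloud CLI reference before running the printed command.

```shell
# Hypothetical helper: prints a create command for review; it does not call
# gcloud. Both --enable-* flags are assumptions (see the text above).
#   $1 - environment name
#   $2 - region
print_create_command() {
  env_name=$1
  location=$2
  printf 'gcloud composer environments create %s \\\n' "$env_name"
  printf '    --location %s \\\n' "$location"
  printf '    --enable-private-environment \\\n'
  printf '    --enable-high-resilience\n'
}

# Example usage:
#   print_create_command example-environment us-central1
```

Printing the command first keeps the sketch free of side effects; other options that your environment needs, such as an image version, are left out here.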
Update a standard environment to high resilience mode
Console
In Google Cloud console, go to the Environments page.
In the list of environments, click the name of your environment. The Environment details page opens.
Select the Environment configuration tab.
In the Resilience mode section, click Edit.
Select High resilience and click Save.
gcloud
gcloud composer environments update ENVIRONMENT_NAME \
--location LOCATION \
--enable-high-resilience
Replace the following:
- ENVIRONMENT_NAME: the name of your environment.
- LOCATION: the region where the environment is located.
API
Construct an environments.patch API request.
In this request:
- In the updateMask parameter, specify the config.resilienceMode mask.
- In the request body, specify HIGH_RESILIENCE to switch to the high resilience mode.
Example:
// PATCH https://composer.googleapis.com/v1/projects/example-project/
// locations/us-central1/environments/example-environment?updateMask=
// config.resilienceMode
{
  "config": {
    "resilienceMode": "HIGH_RESILIENCE"
  }
}
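One way to send such a request from a terminal is sketched below. This is an illustration under stated assumptions rather than an official client: the function names are hypothetical, `curl` is assumed to be installed, `gcloud auth print-access-token` is assumed to return a token for an account that is allowed to update the environment, and the body uses the JSON field name `resilienceMode` to match the update mask.

```shell
# Builds the environments.patch URL used in the example request.
#   $1 - project ID, $2 - region, $3 - environment name
resilience_patch_url() {
  printf 'https://composer.googleapis.com/v1/projects/%s/locations/%s/environments/%s?updateMask=config.resilienceMode\n' "$1" "$2" "$3"
}

# Builds the request body for a resilience mode such as HIGH_RESILIENCE.
# Assumption: the JSON field is spelled resilienceMode, as in the update mask.
resilience_patch_body() {
  printf '{"config": {"resilienceMode": "%s"}}\n' "$1"
}

# Sends the PATCH request (hypothetical wrapper, not called automatically).
#   $1 - project ID, $2 - region, $3 - environment name, $4 - resilience mode
patch_resilience_mode() {
  curl -X PATCH \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json" \
    -d "$(resilience_patch_body "$4")" \
    "$(resilience_patch_url "$1" "$2" "$3")"
}

# Example usage:
#   patch_resilience_mode example-project us-central1 example-environment HIGH_RESILIENCE
```

Keeping the URL and body in separate functions makes it possible to inspect both before any request is sent.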
Terraform
The resilience_mode field in the config block specifies the resilience mode. To use the high resilience mode, set this value to HIGH_RESILIENCE.
resource "google_composer_environment" "example" {
  provider = google-beta
  name     = "ENVIRONMENT_NAME"
  region   = "LOCATION"
  config {
    resilience_mode = "HIGH_RESILIENCE"
  }
}
Replace the following:
- ENVIRONMENT_NAME: the name of your environment.
- LOCATION: the region where the environment is located.
Example:
resource "google_composer_environment" "example" {
  provider = google-beta
  name     = "example-environment"
  region   = "us-central1"
  config {
    resilience_mode = "HIGH_RESILIENCE"
  }
}
Change a highly resilient environment to standard resilience mode
You can change your environment to standard resilience mode at any time. This operation:
- Reduces the number of web servers in your environment to 1.
- Switches off the high availability mode of your environment's Airflow database.
- Doesn't change the settings for the minimum number of Airflow workers, schedulers, or triggerers.
Console
In Google Cloud console, go to the Environments page.
In the list of environments, click the name of your environment. The Environment details page opens.
Select the Environment configuration tab.
In the Resilience mode section, click Edit.
Select Standard resilience (default) and click Save.
gcloud
gcloud composer environments update ENVIRONMENT_NAME \
--location LOCATION \
--disable-high-resilience
Replace the following:
- ENVIRONMENT_NAME: the name of your Cloud Composer environment.
- LOCATION: the region where the environment is located.
API
Construct an environments.patch API request.
In this request:
- In the updateMask parameter, specify the config.resilienceMode mask.
- In the request body, specify RESILIENCE_MODE_UNSPECIFIED to switch to the standard resilience mode.
Example:
// PATCH https://composer.googleapis.com/v1/projects/example-project/
// locations/us-central1/environments/example-environment?updateMask=
// config.resilienceMode
{
  "config": {
    "resilienceMode": "RESILIENCE_MODE_UNSPECIFIED"
  }
}
Terraform
The resilience_mode field in the config block specifies the resilience mode. To use the standard resilience mode, set this value to STANDARD_RESILIENCE.
resource "google_composer_environment" "example" {
  provider = google-beta
  name     = "ENVIRONMENT_NAME"
  region   = "LOCATION"
  config {
    resilience_mode = "STANDARD_RESILIENCE"
  }
}
Replace the following:
- ENVIRONMENT_NAME: the name of your environment.
- LOCATION: the region where the environment is located.
Example:
resource "google_composer_environment" "example" {
  provider = google-beta
  name     = "example-environment"
  region   = "us-central1"
  config {
    resilience_mode = "STANDARD_RESILIENCE"
  }
}
Check if your environment runs in the high resilience mode
Console
In Google Cloud console, go to the Environments page.
In the list of environments, click the name of your environment. The Environment details page opens.
Select the Environment configuration tab.
In the Resilience mode section, view the resilience mode of your environment.
gcloud
To check if the high resilience mode is enabled in your environment, run the following Google Cloud CLI command. Because the command prints the value of the config.resilienceMode field, a value of HIGH_RESILIENCE means that high resilience mode is enabled in your environment.
gcloud composer environments describe ENVIRONMENT_NAME \
--location LOCATION \
--format="value(config.resilienceMode)"
Replace the following:
- ENVIRONMENT_NAME: the name of your Cloud Composer environment.
- LOCATION: the region where the environment is located.
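In a script, the printed value can be turned into a yes/no answer. The helper below is a minimal sketch with a hypothetical name. It assumes that a highly resilient environment reports either the enum value HIGH_RESILIENCE or the literal True, and it treats any other value, including empty output, as standard resilience.

```shell
# Hypothetical helper: succeeds if the value printed by the describe command
# indicates high resilience mode.
# Assumption: HIGH_RESILIENCE or True means enabled; anything else, including
# an empty string, is treated as standard resilience.
is_high_resilience() {
  case "$1" in
    HIGH_RESILIENCE|True) return 0 ;;
    *) return 1 ;;
  esac
}

# Example usage:
#   mode=$(gcloud composer environments describe ENVIRONMENT_NAME \
#     --location LOCATION \
#     --format="value(config.resilienceMode)")
#   if is_high_resilience "$mode"; then
#     echo "high resilience"
#   else
#     echo "standard resilience"
#   fi
```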
What's next
- Perform failover tests for your highly resilient environment.
- Disaster recovery with environment snapshots