Data protection with multi-zone storage

This document provides information for protecting your application data in a Google Distributed Cloud (GDC) air-gapped multi-zone universe. To maintain highly available applications, you can implement a data protection strategy that is resilient to local outages or failures. GDC provides data replication strategies for object storage and block storage so you can maintain failover procedures for primary and secondary zones in your universe.

This document is for IT administrators within the platform administrator group who are responsible for developing disaster recovery workflows, and application developers within the application operator group who are responsible for developing and maintaining applications in a GDC universe.

For more information, see Audiences for GDC air-gapped documentation.

Storage replication for disaster recovery

You can set up robust data protection for your application storage in a multi-zone universe using asynchronous data replication for disaster recovery. This approach involves copying data from a primary zone to a secondary zone at periodic intervals. This mechanism keeps your data protected and accessible if the primary zone experiences an outage.

Data replication for object storage uses dual zone buckets to automatically replicate your data, and doesn't require manual intervention. For more information about creating a dual zone bucket, see Create storage buckets.

Data replication for block storage uses dual zone persistent volumes to replicate your data, and requires a volume failover procedure. For more information, see Replicate volumes asynchronously.

After you configure data replication, a failover procedure applies when the primary zone goes offline. The failover procedures differ for block and object storage replication, but both data replication strategies share the following critical steps:

  1. Verify the primary zone outage.
  2. Stop the replication from the primary zone.
  3. Promote the secondary zone to assume the role of the primary zone, either through manual intervention or a pre-configured failover.
  4. Verify the operational status of the new primary zone.
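The four steps above can be sketched as a small in-memory simulation. This is only an illustration of the ordering of the steps, not a GDC API: the `Zone` and `ReplicationPair` classes, the `fail_over` function, the `"offline"` role label, and the zone names are all hypothetical. The actual procedure depends on whether you are failing over block or object storage.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Zone:
    """Hypothetical stand-in for a zone; not a real GDC resource."""
    name: str
    role: str  # "primary", "secondary", or "offline" (illustrative labels)
    healthy: bool = True


@dataclass
class ReplicationPair:
    """Hypothetical pairing of a primary and a secondary zone."""
    primary: Zone
    secondary: Zone
    replicating: bool = True
    log: List[str] = field(default_factory=list)


def fail_over(pair: ReplicationPair) -> bool:
    """Walk through the four failover steps; return True if a failover ran."""
    # Step 1: verify the primary zone outage before changing anything.
    if pair.primary.healthy:
        pair.log.append("primary is healthy; failover skipped")
        return False
    pair.log.append(f"outage confirmed in {pair.primary.name}")

    # Step 2: stop the replication from the primary zone.
    pair.replicating = False
    pair.log.append("replication stopped")

    # Step 3: promote the secondary zone to the primary role.
    old_primary, new_primary = pair.primary, pair.secondary
    old_primary.role = "offline"  # assumption: label chosen for this sketch
    new_primary.role = "primary"
    pair.primary, pair.secondary = new_primary, old_primary
    pair.log.append(f"{new_primary.name} promoted to primary")

    # Step 4: verify the operational status of the new primary zone.
    if not pair.primary.healthy:
        raise RuntimeError(f"new primary {pair.primary.name} is not operational")
    pair.log.append("new primary verified")
    return True


pair = ReplicationPair(
    primary=Zone("zone-1", "primary", healthy=False),  # simulated outage
    secondary=Zone("zone-2", "secondary"),
)
fail_over(pair)
print(pair.primary.name)  # zone-2
```

The check in step 1 comes first so that a transient monitoring error doesn't trigger an unnecessary promotion.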

Reach out to a member of the infrastructure operator group to confirm your two zones are configured for asynchronous data replication.

The inherent delay in asynchronous data replication means that this setup is most useful for systems that require a low, but non-zero, recovery point objective (RPO). The RPO is the predefined maximum amount of data loss, measured in time, that your system can tolerate. The data at risk is typically the data generated immediately before a disaster event, which might not have been replicated yet. If your system requires minimal data loss but can tolerate a small RPO, then asynchronous data replication is a valuable feature to implement for your applications.

An example of a low non-zero RPO might be a financial trading platform with an RPO of five minutes, where asynchronous data replication is set to copy trade data to a secondary disaster recovery zone every two minutes:

  • This is a low RPO scenario because five minutes is a short maximum acceptable data loss window for a high-volume system.
  • It's a non-zero RPO scenario because asynchronous replication at two-minute intervals leaves a small window of time in which data has not yet been copied, resulting in potential loss.
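The arithmetic behind this example can be checked with a short sketch. It uses a simplified model, which is an assumption and not documented GDC behavior: the worst-case loss is the replication interval plus the time one copy takes to finish. The function names and the `transfer_time` parameter are hypothetical.

```python
from datetime import timedelta


def worst_case_data_loss(
    interval: timedelta, transfer_time: timedelta = timedelta(0)
) -> timedelta:
    """Simplified worst case: a disaster strikes just before a copy completes,
    so the newest usable copy started one interval plus one transfer ago."""
    return interval + transfer_time


def meets_rpo(
    rpo: timedelta, interval: timedelta, transfer_time: timedelta = timedelta(0)
) -> bool:
    """Return True if the worst-case loss stays within the RPO."""
    return worst_case_data_loss(interval, transfer_time) <= rpo


rpo = timedelta(minutes=5)
interval = timedelta(minutes=2)

# Two-minute interval with negligible transfer time: 2 min <= 5 min.
print(meets_rpo(rpo, interval))  # True

# Hypothetical slow transfer of four minutes: 2 + 4 = 6 min > 5 min.
print(meets_rpo(rpo, interval, transfer_time=timedelta(minutes=4)))  # False
```

Under this model, the two-minute interval leaves headroom within the five-minute RPO only while each copy completes quickly, so the replication capability needs to be confirmed with the infrastructure operator group.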

You must work with your infrastructure operator group to define your dual zone asynchronous storage replication workflow, and to verify that the infrastructure's data replication capabilities support your RPO requirements.

What's next