To optimize your database fleet health, we recommend that you follow these best practices for monitoring, availability, and data protection.
Monitoring
Database monitoring, which entails tracking a database's performance and resources, is critical for maintaining the health of a database management system.
Perform regular, frequent database health checks
To keep your databases healthy, we recommend that you check the Database Center dashboard regularly (for example, once a week). Database Center doesn't notify you about database health issues, so a regular review of your organization's databases helps you proactively identify and resolve problems.
Availability configuration
You can edit configurations to improve the durability and reliability of your databases.
Ensure your resource is failover-protected
When a resource is available in multiple zones, it is considered highly available (HA) and is protected by automatic failover. The purpose of an HA configuration is to reduce downtime when a zone or instance becomes unavailable, for example during a zonal outage or when a resource runs out of memory. With high availability, your data remains available to client applications during such an outage.
Configuring a resource to have high availability provides data redundancy within a region. Highly available resources have a primary node and a standby node, each in a different zone. Through synchronous replication to each zone's persistent disk, all writes made to the primary node are replicated to disks in both zones before a transaction is reported as committed. In the event of a node or zone failure, the standby node becomes the new primary node, and users are rerouted to the new primary node. This process is called a failover.
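The commit and failover behavior described above can be illustrated with a small in-memory model. This is a minimal sketch, not how any particular database service implements HA: the `Node` and `HaPair` classes, the zone names, and the records are all hypothetical, and the model deliberately ignores writes made while one zone is down.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical node of an HA pair, with its own zonal disk."""
    zone: str
    disk: list = field(default_factory=list)
    healthy: bool = True


class HaPair:
    def __init__(self, primary: Node, standby: Node) -> None:
        self.primary = primary
        self.standby = standby

    def commit(self, record: str) -> bool:
        """Write to both zonal disks, then report the commit."""
        if not (self.primary.healthy and self.standby.healthy):
            return False  # degraded-mode writes are out of scope for this sketch
        for node in (self.primary, self.standby):
            node.disk.append(record)
        return True

    def failover(self) -> None:
        """Promote the standby to primary; the old primary becomes the standby."""
        self.primary, self.standby = self.standby, self.primary

    def read_all(self) -> list:
        """Serve reads from the primary, failing over first if it is down."""
        if not self.primary.healthy:
            self.failover()
        return list(self.primary.disk)


pair = HaPair(Node("zone-a"), Node("zone-b"))
pair.commit("order-1")
pair.commit("order-2")
pair.primary.healthy = False  # simulate a zonal outage
recovered = pair.read_all()
print(pair.primary.zone, recovered)
```

Because every committed record was written to both disks before `commit` returned, the promoted node in the second zone can serve all committed data after the simulated outage.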
Use cross-region replication
A resource group that is available in multiple regions uses cross-region replication, a feature that asynchronously replicates data and applications across regions. We recommend that you use cross-region replication for the following reasons:
- Disaster recovery: If the region for a primary resource becomes unavailable, you can promote a secondary resource in another region to become the primary resource and use it to serve requests.
- Geographically distributed data: Locating your data closer to the applications that need the data can reduce read latency.
- Geographic load balancing: If slow or overloaded connections occur in one region, you can route traffic to another region.
- Improved read performance: Provisioning read-only resources around the world can improve capacity and performance in those areas.
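To make the disaster recovery case concrete, the following minimal sketch models asynchronous replication and the promotion of a secondary resource. The `Region` and `CrossRegionGroup` classes, the region names, and the records are hypothetical and exist only for illustration. The sketch also shows a general property of asynchronous replication: writes that have not yet reached the secondary when the primary fails are not present after promotion.

```python
from collections import deque


class Region:
    """Hypothetical regional resource holding a copy of the data."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.data = []


class CrossRegionGroup:
    def __init__(self, primary: Region, secondary: Region) -> None:
        self.primary = primary
        self.secondary = secondary
        self.pending = deque()  # committed writes not yet shipped to the secondary

    def write(self, record: str) -> None:
        # Asynchronous: commit in the primary region now, replicate later.
        self.primary.data.append(record)
        self.pending.append(record)

    def replicate(self, max_records: int) -> None:
        """Ship up to max_records queued writes to the secondary region."""
        for _ in range(min(max_records, len(self.pending))):
            self.secondary.data.append(self.pending.popleft())

    def promote_secondary(self) -> list:
        """Promote the secondary and return the writes that never reached it."""
        unreplicated = list(self.pending)
        self.pending.clear()
        self.primary, self.secondary = self.secondary, self.primary
        return unreplicated


group = CrossRegionGroup(Region("region-a"), Region("region-b"))
for i in range(3):
    group.write(f"row-{i}")
group.replicate(max_records=2)  # the third write is still in flight
unreplicated = group.promote_secondary()  # region-a became unavailable
print(group.primary.name, group.primary.data, unreplicated)
```

In this model, the gap between what the primary committed and what the secondary received is the replication lag, which is worth keeping in mind when you plan a promotion.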
Data protection
Data protection is important because it helps protect your organization's data against loss, manipulation, and unauthorized access.
Enable automated backups
Backups protect your data from loss or damage and let you restore lost data to your database resources. If a database resource experiences a problem, you can restore it to a previous state. Enable automated backups for any resource that contains data you can't afford to lose.
Set up long backup retention windows
Your backup retention settings determine the window during which you can recover data if it is affected by errors, corruption, or loss. The longer the backup retention period, the larger the recovery window for that resource.
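The effect of the retention period on the recovery window can be worked through with a short calculation. This is a sketch under stated assumptions: it assumes one automated backup every 24 hours, and the function names, dates, and retention values are hypothetical rather than taken from any product setting.

```python
from datetime import datetime, timedelta


def retained_backups(now: datetime, retention_days: int) -> list:
    """Timestamps of retained backups, assuming one backup every 24 hours."""
    return [now - timedelta(days=i) for i in range(retention_days)]


def latest_clean_backup(backups: list, corrupted_at: datetime):
    """Newest retained backup taken before the corruption, or None."""
    clean = [b for b in backups if b < corrupted_at]
    return max(clean, default=None)


now = datetime(2024, 6, 30, 2, 0)            # corruption is noticed now
corrupted_at = datetime(2024, 6, 20, 12, 0)  # but it happened ten days ago

short = latest_clean_backup(retained_backups(now, 7), corrupted_at)
long = latest_clean_backup(retained_backups(now, 30), corrupted_at)
print(short)
print(long)
```

With the 7-day window, every retained backup was taken after the corruption, so none of them is clean. With the 30-day window, a backup from before the corruption is still available to restore from.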