Important considerations before Aurora to Cloud SQL for MySQL migration
Amazon’s AWS and Google Cloud both provide cloud-based, fully managed MySQL database services. Both cloud services providers have unique features and differences in their respective default configurations. These differences may lead to unanticipated performance or operational issues after a migration from one provider to the other. This article will provide a summary of the issues that can arise during such a migration and their recommended solutions. In particular, the focus will be on migrations of MySQL database services from Aurora MySQL to Cloud SQL for MySQL.
Migration considerations
Character set: performance issue
Aurora uses the default character set server latin1 (until MySQL v 5.7). This is different from the Cloud SQL for MySQL’s default configuration for databases, tables, stored procedures, and functions, which are created with utf8 as the default character set during a migration. This can result in situations that lead to performance problems.
For example, a user can end up in a situation where tables are created with the latin1 character set and stored procedures or functions are created with the Cloud SQL default utf8 character set. In such cases, if the stored procedure or function expects variables as utf8 parameters and compares that variable with the column value, which is in latin1 character set, it will result in performance issues because of comparison of two different character sets and collations.
You can check the character set of routines using the below command:
mysql> select ROUTINE_NAME, ROUTINE_SCHEMA, CHARACTER_SET_NAME, COLLATION_NAME from information_schema.ROUTINES; |
If you were using the default character set in Aurora (until v 5.7) which is latin1, then the default character set should be changed from utf8 to latin1 in Cloud SQL before importing the data.
Another solution could be to change everything to utf8, however, in this case users should test the complete application and dataset as changes to the character set can result in unexpected data representations.
Users can edit this setting in the Cloud SQL instance under the flag section as illustrated below.
Character set: incompatibility issue
Aurora MySQL 5.7 has many collations (for example utf8mb4_0900_ai_ci for character set utf8mb4) which are currently only available in Cloud SQL for MySQL 8.0. If you are using any such collations and import the data in Cloud SQL for MySQL 5.7, you will get an error message like "Error 'Character set '#255' is not a compiled character set".
The solution is to change those collations to available collations in MySQL version 5.7.
Storage engine: all tables must be in InnoDB
Cloud SQL for MySQL doesn’t support the MyISAM engine unlike Aurora. It's advisable to convert all the tables into InnoDB before starting the Migration from Aurora to Cloud SQL.
If there exists a table in an engine other than InnoDB, users can change the table’s engine using “ALTER” command:
mysql> alter table table_1 engine='Innodb' ; Query OK, 0 rows affected (2.89 sec) Records: 0 Duplicates: 0 Warnings: 0 |
Note: The query time depends on the table size and it may lock other operations on the same table. You can also use online schema changes tools like pt-online-schema-change or gh-ost to alter the table without blocking other operations.
Endpoints for read connections
In Aurora users can set up multiple readers behind a single endpoint, however, Cloud SQL for MySQL users doesn’t have this feature out of the box. Every read-replica in Cloud SQL for MySQL has its own IP and users need to use something like ProxySQL to split the traffic among multiple read replica instances.
Aurora doesn’t have a change buffer
The change buffer is a special data structure that caches changes to secondary index pages when those pages are not in the buffer pool and merged later when those pages are loaded into the buffer pool by other read operations. For more details read Change Buffer.
For the use case where workload has lots of writes on the tables with secondary indexes, Aurora may perform slower than Cloud SQL for MySQL, which uses the default InnoDB change buffer feature to defer those updates to later stages. Users should benchmark the performance as per their application workload.
Query cache may impact performance
Query cache stores the select command along with its result in an intermediate storage layer. If an identical statement is received later, the server checks and retrieves the results from the query cache instead of executing that command again. Query cache is shared among sessions, so all sessions can benefit from results cached by already executed statements from other sessions. Read more about query cache.
Aurora enables query cache by default, however, MySQL Community disabled the query cache in the 5.7 version and completely removed it in the 8.0 version, Cloud SQL MySQL also did the same. If your queries rely on the query cache feature of Aurora, the performance may vary in Cloud SQL MySQL. It is recommended to test your queries performance in Cloud SQL MySQL by comparing the execution time with Aurora.
Replication mechanism may impact performance
For read replicas within a region, Aurora uses the concept of cluster volume which has copies of the data across three availability zones within that region. The replication lag in Aurora is usually much less than 100 milliseconds because the primary and the replicas within the same database cluster all see the data in the cluster volume as a single logical volume. Furthermore, for cross region read replica Aurora uses disk based data sync across regions instead of binary log based replication.
In short, replication is handled by the storage layer in Aurora whereas in Cloud SQL for MySQL the standard replication mechanism of transferring the binary log to the replica instance and then replaying those logs in replica MySQL instances is used. We can improve the performance of replication by configuring the parallel replication in Cloud SQL. Read details about setting up the parallel replication.
Although the replication lag depends on the amount of data changed by the application and the network between the primary and replica instances, most applications work fine without noticeable lag in both Aurora and Cloud SQL for MySQL. However, if the application is write heavy and the application is reading from the replicas then we suggest benchmarking the replication lag in both AWS Aurora and the CloudSQL MySQL before migration.
Global transaction identifier (GTID) based replication
Cloud SQL for MySQL uses GTID replication unlike AWS Aurora which uses disk based data synchronisation. Users should check for the limitations of GTID listed below before migration and do any required changes in their application if application workflow is dependent on any of those features:
- CREATE TABLE ... SELECT statements are not allowed when using GTID-based replication.
- CREATE TEMPORARY TABLE and DROP TEMPORARY TABLE statements are not supported inside transactions, procedures, functions, and triggers. It is possible to use these statements with GTIDs enabled, but only outside of any transaction, and only with autocommit=1.
For more about GTID-based limitations read Limitations in GTID.
Related products and services
Google Cloud offers a managed MySQL database built to suit your business needs, from retiring your on-premises data center, to running SaaS applications, to migrating core business systems.