Deploy Atlas Live Migration to migrate MongoDB to MongoDB Atlas

Last reviewed 2023-05-08 UTC

This document describes how you deploy the architecture in Use Atlas Live Migration to migrate MongoDB to MongoDB Atlas.

This document is intended for database architects, database administrators, and database engineers who are interested in a fully hosted MongoDB service, or who are responsible for migrating MongoDB databases in a MongoDB replica set to a MongoDB Atlas cluster.

Architecture

The following diagram shows the deployment architecture that you create in this document:

MongoDB servers on Compute Engine with the migration path from the primary to MongoDB Atlas.

In the diagram, an arrow represents the data migration path from the source MongoDB replica set running on Compute Engine to the target cluster running in MongoDB Atlas on Google Cloud. For more information about the architecture, see Use Atlas Live Migration to migrate MongoDB to MongoDB Atlas.

Objectives

  • Set up your self-managed source by creating and loading documents into a sample MongoDB replica set.
  • Set up a migration target cluster in MongoDB Atlas.
  • Use Atlas Live Migration to migrate a database from your self-managed MongoDB replica set to a fully managed MongoDB Atlas cluster.
  • Understand and select testing, cutover, and fallback strategies.

Costs

Deployment of this architecture uses billable components of Google Cloud, including Compute Engine:

To deploy this architecture, you can't use the MongoDB Atlas free tier. The available machine types in the free tier don't support Atlas Live Migration. The minimum required machine type (M10 at the time of this writing) has an hourly service cost in MongoDB Atlas. To generate a price estimate, go to MongoDB Atlas Pricing and review the Google Cloud pricing information. If you implement this migration in production, we recommend that you use the regular hosted version of MongoDB Atlas.

When you finish this deployment, you can avoid continued billing by deleting the resources that you created. For more information, see Clean up.

Before you begin

  1. In the Google Cloud console, on the project selector page, select or create a Google Cloud project.

    Go to project selector

  2. Make sure that billing is enabled for your Google Cloud project. Learn how to check if billing is enabled on a project.

Create a self-managed MongoDB replica set

To start the deployment, you install the MongoDB replica set on Google Cloud. This database serves as your source database. You then check whether your source database meets all required preconditions. This precondition check helps to prepare you for migration in a production environment. Even if a MongoDB replica set already exists in your production environment, you still need to check the preconditions.

After you complete the precondition check, you must enable authentication and restart the source MongoDB instance. Finally, to test the migration, you add a sample data set to the source MongoDB instance; this data set is later migrated to the target database.

Install the MongoDB replica set

  1. In the Google Cloud Marketplace, go to the MongoDB replica set installation on Compute Engine.

    Go to MongoDB in Cloud Marketplace

  2. Click Launch. Because several Google Cloud APIs are enabled, the launch process can take a while.

    If you have permissions for several projects, a list of projects is displayed. Select the project for your MongoDB installation.

    A MongoDB replica set is deployed on a set of Compute Engine instances according to a Deployment Manager template.

  3. Accept all the default configuration settings.

  4. Click Deploy.

  5. In the Google Cloud console, activate Cloud Shell.

    Activate Cloud Shell

    At the bottom of the Google Cloud console, a Cloud Shell session starts and displays a command-line prompt. Cloud Shell is a shell environment with the Google Cloud CLI already installed and with values already set for your current project. It can take a few seconds for the session to initialize.

  6. Use an SSH connection to log in to the Compute Engine instance that runs the MongoDB primary:

    gcloud compute ssh MONGODB_VM_NAME --project PROJECT_ID --zone ZONE_OF_VM
    

    Replace the following:

    • MONGODB_VM_NAME: the name of the primary replica of the MongoDB replica set.
    • PROJECT_ID: the name of your Google Cloud project.
    • ZONE_OF_VM: the zone in which your virtual machine (VM) instance resides. For more information, see Geography and regions.

    If an SSH key is generated, the system asks for a passphrase. If you don't want to provide a passphrase, press Enter. If you do provide a passphrase, note it for future reference.

    If you are unable to connect by using Cloud Shell, use the SSH option in Deployment Manager to connect to one of the servers-tier VMs.

  7. Launch the mongo shell:

    mongo
    
  8. List the existing databases:

    show dbs
    

    The output is similar to the following:

    admin   0.000GB
    config  0.000GB
    local   0.000GB
    

    Keep the mongo shell open for upcoming commands.

You have now created and accessed your MongoDB replica set and confirmed that it's operational.

Check preconditions for source database

Atlas Live Migration requires that the source MongoDB replica set meets specific configuration criteria, or preconditions. Although the source MongoDB replica set that you installed as part of this deployment complies with the required version, you still need to check for preconditions in a production environment. The precondition checks are outlined in the Atlas documentation.

To check whether all preconditions are met, do the following:

  1. In the mongo shell, check that the version of MongoDB is 2.6 or later. (In a production environment, you connect to the MongoDB server by using an SSH connection, open the mongo shell, and then run the following command to determine the version.)

    db.version()
    

    The output displays the version. If your version is earlier than 2.6, you need to follow the upgrade instructions.

  2. Check that your current deployment is a MongoDB replica set:

    rs.status()
    

    The output shows the status of the MongoDB replica set. The following example output indicates a MongoDB instance that was started without replication enabled:

    {
        "ok" : 0,
        "errmsg" : "not running with --replSet",
        "code" : 76,
        "codeName" : "NoReplicationEnabled"
    }
    

    In this case, stop and restart the MongoDB instance with the MongoDB replica set enabled. If you have a standalone instance of MongoDB, upgrade the MongoDB instance to a MongoDB replica set.

  3. Check that authentication is enabled on your source cluster by logging in:

    mongo -u YOUR_ADMIN_USERNAME -p --authenticationDatabase admin
    

    Replace the following:

    • YOUR_ADMIN_USERNAME: the administrator username of your deployment.

    The MongoDB replica set created earlier doesn't have authentication enabled.

    If authentication isn't enabled, you need to follow instructions to enable authentication. The following is an example command to enable authentication, with an example username and password:

    use admin
    db.createUser(
      {
        user: "myUserAdmin",
        pwd: "myUserAdminPassword",
        roles: [ { role: "userAdminAnyDatabase", db: "admin" }, "readWriteAnyDatabase", "clusterMonitor" ]
      }
    )
    

    After authentication is enabled, the MongoDB role clusterMonitor is required in order to execute rs.status(). The preceding command specifies this role.

  4. Check that the administrator user has the proper roles assigned for the version of the MongoDB replica set. For a list of roles that correspond to a particular version, see the discussion on source cluster security in the Atlas Live Migration documentation.

    use admin
    db.getUser("YOUR_ADMIN_USERNAME")
    

    The username must be placed between quotation marks.

  5. (Optional) If your MongoDB deployment is based on a version earlier than 4.2, it might contain indexes with keys that exceed the 1024-byte index key limit. In this case, set the MongoDB server parameter failIndexKeyTooLong to false before you start the Atlas Live Migration procedure.
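The following mongo shell fragment is a minimal sketch of that parameter change; failIndexKeyTooLong is the documented server parameter for pre-4.2 versions, and you run the command against the source primary:

```
// Hedged example: relax the index key length check on the source
// (applies only to MongoDB versions earlier than 4.2).
db.adminCommand( { setParameter: 1, failIndexKeyTooLong: false } )
```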

After you verify preconditions and make any necessary changes, you complete configurations and restart your database, which is described in the next section.
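If the rs.status() check reported NoReplicationEnabled, the instance was started without replication. The following mongod.conf stanza is a minimal sketch for enabling it; rs0 is an example replica set name:

```yaml
# Example mongod.conf fragment: start the instance as a replica set member.
replication:
  replSetName: "rs0"
```

After you restart mongod with this configuration, run rs.initiate() in the mongo shell to initialize the replica set.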

Enable authentication and restart the MongoDB replica set

To turn on authentication, you create key files and an administrator. In a production environment, you can use scripts to automate the process.

  1. In Cloud Shell, create a key file:

    openssl rand -base64 756 > PATH_TO_KEY_FILE
    

    Replace the following:

    • PATH_TO_KEY_FILE: the path where the MongoDB key file is stored, for example, /etc/mongo-key.
  2. Enable authorization for each of the three VMs:

    1. Copy the key file to the VM:

      gcloud compute scp PATH_TO_KEY_FILE NAME_OF_THE_VM:PATH_TO_KEY_FILE --zone=ZONE_OF_VM
      

      Replace the following:

      • NAME_OF_THE_VM: the name of one of the VMs running a replica of the replica set.
      • ZONE_OF_VM: the Google Cloud zone where the VM resides that is referred to in NAME_OF_THE_VM.
    2. Use an SSH connection to log in to the VM and change the owner and the access permissions of the key file:

      sudo chown mongodb:mongodb PATH_TO_KEY_FILE
      
      sudo chmod 400 PATH_TO_KEY_FILE
      
    3. In your preferred text editor, open the mongod.conf file in edit mode. If you want to write back any changes, you might need to use the sudo command to start your text editor.

    4. Edit the security section of the mongod.conf file:

      security:
        authorization: enabled
        keyFile: PATH_TO_KEY_FILE
      
    5. Restart the replica:

      sudo service mongod restart
      
  3. Verify that you can log in to the primary of the MongoDB replica set:

    mongo -u YOUR_ADMIN_USERNAME -p --authenticationDatabase admin
    
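The key file handling in this section can be sketched as a short script. This is a hedged example: /tmp/mongo-key stands in for your actual key file path, and on the MongoDB VMs the file must additionally be owned by the mongodb user, as shown in the chown step above.

```shell
#!/bin/sh
set -e

# Example key file path; the deployment above uses a path such as /etc/mongo-key.
KEYFILE=/tmp/mongo-key
rm -f "$KEYFILE"

# Generate 756 random bytes, base64-encoded, as the shared replica set key.
openssl rand -base64 756 > "$KEYFILE"

# MongoDB rejects key files with permissive modes; owner read-only is required.
chmod 400 "$KEYFILE"

ls -l "$KEYFILE"
```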

Insert sample data

In the following steps, you insert sample data into the source database and then verify that the documents are successfully inserted:

  1. In Cloud Shell, use ssh to connect to the MongoDB primary Compute Engine instance:

    gcloud compute ssh MONGODB_VM_NAME --project PROJECT_ID --zone ZONE_OF_VM
    

    You might be required to provide the passphrase for the SSH key.

  2. Start the mongo shell:

    mongo -u YOUR_ADMIN_USERNAME -p --authenticationDatabase admin
    

    Provide the password that you specified when you created the administrator username.

  3. Create a database:

    use migration
    
  4. Create a collection:

    db.createCollection("source")
    
  5. Verify that the collection is empty:

    db.source.count()
    
  6. Add the following five documents as the initial data set:

    db.source.insert({"document_number": 1})
    db.source.insert({"document_number": 2})
    db.source.insert({"document_number": 3})
    db.source.insert({"document_number": 4})
    db.source.insert({"document_number": 5})
    

    The output for each of these commands is similar to the following:

    WriteResult({ "nInserted" : 1 })
    
  7. Verify that you successfully added the five documents to the source collection. The result must be 5.

    db.source.count()
    

    After the database migration is set up and started, the documents are migrated to the target cluster in MongoDB Atlas.

Create a cluster in MongoDB Atlas

A MongoDB replica set is called a cluster in MongoDB Atlas. If you don't have a cluster set up as your target database, follow the steps in this section. These steps are based on the MongoDB documentation. If you already have a cluster set up as your target database, you can skip this section.

  1. In Cloud Marketplace, go to the MongoDB Atlas - Free Tier Installation page.

    Go to MongoDB Atlas on Marketplace

  2. Click Visit MongoDB Site to Sign Up.

  3. Click Launch your first cluster.

  4. Fill in the required information and click Get started free. Note the information that you provided.

  5. Click Advanced Configuration Options.

  6. For Cloud Provider & Region, select Google Cloud Platform and Iowa (us-central1).

  7. Click the Cluster Tier tab, and then select M10.

  8. Click the Additional Settings tab, select MongoDB 4.0 or MongoDB 4.2, and then turn off backup.

  9. Click Create Cluster.

    Wait until the creation of the cluster is completed. Note that the project name is Project 0 (with a blank space) and the cluster name is Cluster0 (without the blank space).

The target cluster is set up and running in MongoDB Atlas.

Test the failover of the MongoDB Atlas cluster

After the migration completes, the cluster in MongoDB Atlas executes a rolling restart, in which each of the cluster members restarts in turn. To ensure that your applications can tolerate this process, test the failover. In MongoDB Atlas, you can use the Test Failover option in the cluster's ellipsis (...) menu.

Start the live migration

To migrate the data from the source to the target database, do the following:

  1. Log in to MongoDB Atlas.

  2. Go to the Clusters page, and then select the cluster that you want to migrate to.

  3. In the target cluster (Cluster0) pane, click the ellipsis (...) menu.

  4. Select Migrate Data to this Cluster.

  5. In the window that opens, review the information. When you're ready to migrate, click I'm ready to migrate.

    A window with data migration instructions is displayed. The IP addresses that are listed must be able to access the MongoDB replica set. If you haven't created a firewall rule for those addresses, use Cloud Shell to add a firewall rule that's based on the following example command, replacing the source ranges with the IP addresses that MongoDB Atlas lists:

    gcloud compute firewall-rules create "allow-mongodb-atlas" \
        --allow=tcp:27017 \
        --source-ranges="35.170.231.208/32,3.92.230.111/32,3.94.238.78/32,54.84.208.96/32" \
        --direction=INGRESS
    
  6. In the Hostname:Port of the primary of your replica set field, enter the IP address and port for the primary of the MongoDB replica set—for example, IP_ADDRESS:PORT_FOR_PRIMARY.

    1. To determine the primary instance, run the following command in the mongo shell on any instance that's running in your Google Cloud project:

      rs.isMaster().primary
      
    2. To look up the corresponding external IP address, go to the Compute Engine VM instances page. The standard MongoDB port is 27017.

  7. Enter the administrator username and password of your MongoDB replica set. Leave all other settings with their default values.

  8. Click Validate, and then do one of the following:

    • If the validation succeeds, click Start Migration.
    • If the validation doesn't succeed, troubleshoot by using the instructions that are provided. For example, if MongoDB Atlas can't connect to the MongoDB replica set, it provides the IP addresses from which MongoDB Atlas is trying to connect. For these addresses, add a firewall rule that allows TCP traffic on port 27017 for the servers of the MongoDB replica set.

    The MongoDB Atlas screen shows the migration progress. Wait for the message Initial Sync Complete! in the progress bar.
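The external IP address lookup in step 6 can also be done from the command line. The following is a hedged sketch that uses the same MONGODB_VM_NAME and ZONE_OF_VM placeholders as the earlier commands:

```
# Hedged example: print the external IP address of a Compute Engine VM.
gcloud compute instances describe MONGODB_VM_NAME \
    --zone=ZONE_OF_VM \
    --format='get(networkInterfaces[0].accessConfigs[0].natIP)'
```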

The initial load from the MongoDB replica set is now complete. The next step is to verify that the initial load is successful.

After the initial migration is complete, MongoDB Atlas provides an estimate of the number of hours left until you must make the cutover to the target cluster. You might also receive an email from MongoDB that provides you with the number of hours left, the ability to extend that time, and a warning that if a final cutover isn't made within the given time, the migration will be canceled.

Verify the database migration

It's important to design and implement a database migration verification strategy to confirm that the database migration is successful. While the particular verification strategy depends on your specific use case, we recommend that you perform these checks:

  • Completeness check. Verify that the initial document set successfully migrated from the source databases (initial load).
  • Dynamic check. Verify that changes in the source databases are being transferred to the target databases (ongoing migration).

First, verify that the initial load is successful:

  1. In MongoDB Atlas, click Clusters.

  2. Click Collections.

  3. Verify that a database named migration exists and that the collection named source has five documents.

Next, verify that ongoing changes to the source databases are reflected in the target databases:

  1. In Cloud Shell, use an SSH connection to log in to the primary VM of the source MongoDB replica set.

  2. Start the mongo shell:

    mongo
    
  3. Insert another document:

    use migration
    db.source.insert({"document_number": 6})
    
  4. On the MongoDB Atlas Collections page for the migration database, click Refresh and observe that one document is added to the source collection.

You have now verified that Atlas Live Migration has automatically migrated all original data from the source and any ongoing changes to the source.

Test your Atlas target cluster

In a production environment, it's important to test applications that access target databases to ensure that they function properly. This section discusses several testing strategies.

Test applications with a target database during a migration

As the preceding section demonstrates, you can perform application testing during an ongoing database migration. This approach might work if the applications don't change the target database in ways that conflict with the data being migrated from the source databases. Whether this approach is an option for you depends on your environment and its dependencies. In particular, if a test application writes data to the target database, those writes might conflict with the ongoing migration.

Test applications with a temporary target database

If you can't test applications during a production database migration, then you might migrate the data to temporary target databases that you use only for testing, and then delete the test targets after a test migration.

For this method, you stop the test migration at some point (as if the database migration is completed) and then test applications against these test databases. After testing is complete, you delete the target databases and start the production database migration to migrate the data to permanent target databases. The benefit of this strategy is that the target databases can be read and written because they are only for testing.

Test applications with a target database after migration is complete

If neither of the preceding strategies is viable, the remaining strategy is to test the application on the database after the migration is complete. After all data is in the target databases, test the applications before making them available to users. If testing includes writing data, then it's important that test data is written, not production data, in order to avoid production data inconsistencies. To avoid data inconsistencies or superfluous data in the target database, the test data must be removed after the tests are completed.

We recommend that you back up the target databases before opening them to production access by application systems. This step helps ensure that there is a consistent starting point that you can recreate, if needed.

Cut over from the source MongoDB replica set to the target cluster

After you complete any tests and verify that ongoing changes are reflected in the target database, you can plan the cutover.

First, you need to stop any changes to the source database so that Atlas Live Migration can drain the not-yet-migrated changes to the target. After all changes are captured in the target, you can initiate the Atlas Live Migration cutover process. After that process is complete, you can switch clients from the source to the target databases.
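One hedged way to stop changes during the drain is to lock writes on the source primary; db.fsyncLock() and db.fsyncUnlock() are standard mongo shell helpers, but whether you use them or simply stop application traffic depends on your environment:

```
// Hedged example: block writes on the source primary while Atlas Live
// Migration drains the remaining oplog entries.
db.fsyncLock()

// If you need to resume writes on the source (for example, to fall back),
// release the lock:
db.fsyncUnlock()
```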

  1. In MongoDB Atlas, click Clusters.

  2. In the Cluster0 pane, click Prepare to Cutover. A step-by-step explanation of the cutover process and a connection string to the target cluster are displayed.

  3. Click Cut over.

    When the migration is complete, the message Success! Your cluster migration is complete. is displayed.

You have now successfully migrated your MongoDB replica set to a MongoDB Atlas cluster.

Prepare a fallback strategy

After a cutover is finished, the target cluster is the system of record; the source databases are out of date and are eventually removed. However, you might want to fall back to the source databases in case of severe failures on the new target databases. For example, a failure can occur if business logic in an application isn't exercised during testing and then later fails to function properly. Another failure can occur when the performance or latency of the target doesn't match that of the source databases and causes errors.

To fall back from such failures, you might want to keep the original s