Migrate from AWS to Google Cloud: Migrate from Amazon S3 to Cloud Storage

Last reviewed 2024-07-30 UTC

Google Cloud provides tools, products, guidance, and professional services to help you migrate data from Amazon Simple Storage Service (Amazon S3) to Cloud Storage. This document discusses how to design, implement, and validate a plan to migrate from Amazon S3 to Cloud Storage. The document describes a portion of the overall migration process in which you create an inventory of Amazon S3 artifacts and create a plan for how to handle the migration process.

The discussion in this document is intended for cloud administrators who want details about how to plan and implement a migration process. It's also intended for decision-makers who are evaluating the opportunity to migrate and who want to explore what migration might look like.

This document is part of a multi-part series about migrating from AWS to Google Cloud.

For this migration to Google Cloud, we recommend that you follow the migration framework described in Migrate to Google Cloud: Get started.

The following diagram illustrates the path of your migration journey.

Migration path with four phases.

You might migrate from your source environment to Google Cloud in a series of iterations—for example, you might migrate some workloads first and others later. For each separate migration iteration, you follow the phases of the general migration framework:

  1. Assess and discover your workloads and data.
  2. Plan and build a foundation on Google Cloud.
  3. Migrate your workloads and data to Google Cloud.
  4. Optimize your Google Cloud environment.

For more information about the phases of this framework, see Migrate to Google Cloud: Get started.

To design an effective migration plan, we recommend that you validate each step of the plan, and ensure that you have a rollback strategy. To help you validate your migration plan, see Migrate to Google Cloud: Best practices for validating a migration plan.

Assess the source environment

In the assessment phase, you determine the requirements and dependencies to migrate your source environment to Google Cloud.

The assessment phase is crucial for the success of your migration. You need to gain deep knowledge about the workloads you want to migrate, their requirements, their dependencies, and your current environment. You need to understand your starting point to successfully plan and execute a Google Cloud migration.

The assessment phase consists of the following tasks:

  1. Build a comprehensive inventory of your workloads.
  2. Catalog your workloads according to their properties and dependencies.
  3. Train and educate your teams on Google Cloud.
  4. Build experiments and proofs of concept on Google Cloud.
  5. Calculate the total cost of ownership (TCO) of the target environment.
  6. Choose the migration strategy for your workloads.
  7. Choose your migration tools.
  8. Define the migration plan and timeline.
  9. Validate your migration plan.

For more information about the assessment phase and these tasks, see Migrate to Google Cloud: Assess and discover your workloads. The following sections are based on information in that document.

Build an inventory of your Amazon S3 buckets

To scope your migration, you create two inventories: an inventory of your Amazon S3 buckets, and an inventory of the objects that are stored in the buckets.

After you build the inventory of your Amazon S3 buckets, refine the inventory by considering the following data points about each Amazon S3 bucket:

  • How you've configured Amazon S3 bucket server-side encryption.
  • Your settings for Amazon S3 bucket identity and access management.
  • The configuration for S3 Block Public Access.
  • Any cost allocation tags for Amazon S3 buckets.
  • The configuration for S3 Object Lock.
  • How you're accessing the Amazon S3 bucket.
  • How you've configured Requester Pays.
  • The settings for Amazon S3 object versioning.
  • The configuration for AWS Backup policies for Amazon S3.
  • Whether you're using Amazon S3 Intelligent-Tiering.
  • How you've configured Amazon S3 object replication.
  • The Amazon S3 object lifecycle.

We also recommend that you gather data about your Amazon S3 buckets that lets you compute aggregate statistics about the objects that each bucket contains. For example, the total object size, average object size, and object count can help you estimate the time and cost needed to migrate from an Amazon S3 bucket to a Cloud Storage bucket.
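As a minimal sketch of how these aggregate statistics can feed an estimate, the following Python example summarizes a list of object sizes and derives a rough lower bound on transfer time. The function names and the sustained-throughput parameter are assumptions made for illustration. Real transfers are also affected by object count, request overhead, and throttling.

```python
from dataclasses import dataclass


@dataclass
class BucketStats:
    object_count: int
    total_bytes: int

    @property
    def average_bytes(self) -> float:
        # Avoid dividing by zero for empty buckets.
        return self.total_bytes / self.object_count if self.object_count else 0.0


def summarize(object_sizes):
    """Aggregate an iterable of object sizes (in bytes) into bucket statistics."""
    sizes = list(object_sizes)
    return BucketStats(object_count=len(sizes), total_bytes=sum(sizes))


def estimate_transfer_hours(total_bytes, throughput_mbps):
    """Rough lower bound on transfer time at a sustained throughput in megabits per second."""
    if throughput_mbps <= 0:
        raise ValueError("throughput_mbps must be positive")
    seconds = (total_bytes * 8) / (throughput_mbps * 1_000_000)
    return seconds / 3600
```

For example, `estimate_transfer_hours(stats.total_bytes, 1000)` gives the number of hours needed if a sustained 1 Gbps were available for the whole transfer, which is an optimistic assumption.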

To build the inventory of your Amazon S3 buckets and to gather data points about your Amazon S3 buckets, you can implement data-collection mechanisms and processes that rely on AWS tools, such as the following:

  • Amazon S3 monitoring tools
  • S3 Analytics
  • AWS Multi-Account Multi-Region Data Aggregation
  • AWS APIs
  • AWS developer tools
  • The AWS command-line interface
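As one possible starting point for such a data-collection mechanism, the following Python sketch builds a small bucket inventory through the AWS APIs. It assumes a client object that behaves like the one returned by `boto3.client("s3")`. It collects only the versioning status and the bucket region. The output field names are hypothetical, and the other data points listed earlier would need additional calls and error handling.

```python
def inventory_buckets(s3_client):
    """Return one dict per bucket with a few inventory data points.

    `s3_client` is assumed to behave like boto3.client("s3").
    """
    inventory = []
    for bucket in s3_client.list_buckets().get("Buckets", []):
        name = bucket["Name"]
        versioning = s3_client.get_bucket_versioning(Bucket=name)
        location = s3_client.get_bucket_location(Bucket=name)
        inventory.append({
            "name": name,
            # "Status" is absent when versioning was never enabled on the bucket.
            "versioning": versioning.get("Status", "Unversioned"),
            # Amazon S3 reports an empty LocationConstraint for us-east-1.
            "region": location.get("LocationConstraint") or "us-east-1",
        })
    return inventory
```

With boto3 installed and AWS credentials configured, calling `inventory_buckets(boto3.client("s3"))` would return the list for the current account.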

To help you avoid issues during the migration, and to help estimate the effort needed for the migration, we recommend that you evaluate how Amazon S3 bucket features map to similar Cloud Storage bucket features. The following table summarizes this mapping.

Amazon S3 feature | Cloud Storage feature
Bucket naming rules | Bucket name requirements
Bucket location | Bucket location
Server-side encryption | Encryption options
Identity and access management | Identity and Access Management (IAM)
Public access | Public data access; Public access prevention
Cost allocation S3 bucket tags | Tags and labels
S3 Object Lock | Retention policies and retention policy lock
Methods for accessing an Amazon S3 bucket | Uploads and downloads
Requester Pays | Requester Pays
Object versioning | Object versioning
AWS Backup policies for Amazon S3 | Event-driven transfer jobs
Intelligent-Tiering | Autoclass
Object replication | Redundancy across regions and turbo replication; Event-driven transfer jobs
Object lifecycle | Object Lifecycle Management

The features listed in the preceding table might look similar when you compare them. However, differences in the design and implementation of the features in the two cloud providers can have significant effects on your migration from Amazon S3 to Cloud Storage.
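To make this evaluation repeatable across many buckets, you can encode the mapping as data. The following Python sketch transcribes part of the preceding table into a dictionary and reports which Cloud Storage features to review for the Amazon S3 features that a bucket uses. The function name and the idea of listing unmapped features separately are assumptions made for illustration.

```python
# Partial transcription of the bucket feature mapping table.
S3_TO_CLOUD_STORAGE = {
    "Server-side encryption": ["Encryption options"],
    "Public access": ["Public data access", "Public access prevention"],
    "S3 Object Lock": ["Retention policies and retention policy lock"],
    "Requester Pays": ["Requester Pays"],
    "Object versioning": ["Object versioning"],
    "Intelligent-Tiering": ["Autoclass"],
    "Object replication": [
        "Redundancy across regions and turbo replication",
        "Event-driven transfer jobs",
    ],
    "Object lifecycle": ["Object Lifecycle Management"],
}


def features_to_review(s3_features_in_use):
    """Return (mapped features to review, S3 features with no mapping entry)."""
    review, unmapped = {}, []
    for feature in s3_features_in_use:
        if feature in S3_TO_CLOUD_STORAGE:
            review[feature] = S3_TO_CLOUD_STORAGE[feature]
        else:
            # No entry in the table: this feature needs a manual assessment.
            unmapped.append(feature)
    return review, unmapped
```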

Build an inventory of the objects stored in your Amazon S3 buckets

After you build the inventory of your Amazon S3 buckets, we recommend that you build an inventory of the objects stored in these buckets by using the Amazon S3 inventory tool.

To build the inventory of your Amazon S3 objects, consider the following for each object:

  • Amazon S3 object name
  • Amazon S3 object size
  • Amazon S3 object metadata
  • Amazon S3 object subresources
  • Amazon S3 object versions, and if you need to migrate these versions
  • Amazon S3 object presigned URLs
  • Amazon S3 object transformations
  • Amazon S3 object tags
  • Amazon S3 object storage classes
  • Amazon S3 object archiving

We also recommend that you gather data about your Amazon S3 objects to understand how often you and your workloads create, update, and delete Amazon S3 objects.
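As a minimal sketch of how an object inventory can be summarized, the following Python example aggregates object count and total size per storage class from a CSV inventory report. It assumes that the report has no header row and that the column names come from a comma-separated schema string, such as the one that an inventory manifest provides. The available columns depend on how you configured the inventory, and the function name is hypothetical.

```python
import csv
import io
from collections import defaultdict


def summarize_by_storage_class(file_schema, csv_text):
    """Aggregate object count and bytes per storage class.

    file_schema: comma-separated column names, for example
                 "Bucket, Key, Size, StorageClass" (assumed layout).
    csv_text:    decompressed contents of one CSV inventory file.
    """
    columns = [column.strip() for column in file_schema.split(",")]
    totals = defaultdict(lambda: {"objects": 0, "bytes": 0})
    for row in csv.reader(io.StringIO(csv_text)):
        if not row:
            continue  # Skip blank lines.
        record = dict(zip(columns, row))
        entry = totals[record.get("StorageClass", "UNKNOWN")]
        entry["objects"] += 1
        # Size can be empty for some rows, such as delete markers.
        entry["bytes"] += int(record.get("Size") or 0)
    return dict(totals)
```

Comparing the summaries of inventory reports taken on different days is one rough way to see how quickly the object population changes.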

To help you avoid issues during the migration, and to help estimate the effort needed for the migration, we recommend that you evaluate how Amazon S3 object features map to similar Cloud Storage object features. The following table summarizes this mapping.

Amazon S3 feature | Cloud Storage feature
Object naming rules | Object name requirements
Object metadata; Object tags | Object metadata
Object subresources | Object metadata
Object presigned URLs | Signed URLs
Object transformations | Pub/Sub notifications for Cloud Storage; Cloud Run functions; Cloud Run
Object storage classes; Object archiving | Cloud Storage storage classes

As noted earlier, the features listed in the preceding table might look similar when you compare them. However, differences in the design and implementation of the features in the two cloud providers can have significant effects on your migration from Amazon S3 to Cloud Storage.
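Because both Amazon S3 object metadata and object tags map to Cloud Storage object metadata in the preceding table, you might need to decide how to combine them. The following Python sketch shows one hypothetical convention: tags are stored as custom metadata keys with a prefix, and key collisions are reported instead of silently overwritten. The prefix and the function name are assumptions, not a behavior of any Google Cloud tool.

```python
def merge_into_custom_metadata(user_metadata, tags, tag_prefix="s3-tag-"):
    """Combine S3 user-defined metadata and object tags into one metadata dict.

    The "s3-tag-" prefix is a hypothetical naming convention for this sketch.
    """
    merged = dict(user_metadata)
    for key, value in tags.items():
        target = f"{tag_prefix}{key}"
        if target in merged:
            # Surface the conflict so that it can be resolved deliberately.
            raise ValueError(f"metadata key collision: {target}")
        merged[target] = value
    return merged
```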

Complete the assessment

After you build the inventories from your Amazon S3 environment, complete the rest of the activities of the assessment phase as described in Migrate to Google Cloud: Assess and discover your workloads.

Plan and build your foundation

In the plan and build phase, you provision and configure the infrastructure to do the following:

  • Support your workloads in your Google Cloud environment.
  • Connect your source environment and your Google Cloud environment to complete the migration.

The plan and build phase is composed of the following tasks:

  1. Build a resource hierarchy.
  2. Configure Google Cloud's Identity and Access Management (IAM).
  3. Set up billing.
  4. Set up network connectivity.
  5. Harden your security.
  6. Set up logging, monitoring, and alerting.

For more information about each of these tasks, see Migrate to Google Cloud: Plan and build your foundation.

Migrate data and workloads from Amazon S3 to Cloud Storage

To migrate data from Amazon S3 to Cloud Storage, we recommend that you design a data migration plan by following the guidance in Migrate to Google Cloud: Transfer your large datasets. That document recommends using Storage Transfer Service, a Google Cloud product that lets you migrate data from several sources to Cloud Storage, such as from on-premises environments or from other cloud storage providers. Storage Transfer Service supports several types of data transfer jobs, such as the following:

  • Run-once transfer jobs, which transfer data from Amazon S3 or other supported sources to Cloud Storage on demand.
  • Scheduled transfer jobs, which transfer data from Amazon S3 or other supported sources to Cloud Storage on a schedule.
  • Event-driven transfer jobs, which automatically transfer data when Amazon S3 sends Amazon S3 Event Notifications to Amazon Simple Queue Service (SQS).
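To illustrate how these job types differ, the following Python sketch builds request bodies shaped like the Storage Transfer Service `transferJobs` resource for a run-once job and an event-driven job. The helper names are hypothetical, and the field layout reflects our reading of the REST API, so verify it against the current API reference before use. AWS credentials (for example, an access key or a role ARN on the source) are deliberately omitted.

```python
import datetime


def _date(value):
    """Convert a datetime.date into the year/month/day structure used in schedules."""
    return {"year": value.year, "month": value.month, "day": value.day}


def _base_job(project_id, s3_bucket, gcs_bucket):
    return {
        "projectId": project_id,
        "status": "ENABLED",
        "transferSpec": {
            # Source credentials are omitted in this sketch.
            "awsS3DataSource": {"bucketName": s3_bucket},
            "gcsDataSink": {"bucketName": gcs_bucket},
        },
    }


def run_once_job(project_id, s3_bucket, gcs_bucket, run_date):
    """Run-once job: the schedule starts and ends on the same date."""
    job = _base_job(project_id, s3_bucket, gcs_bucket)
    job["schedule"] = {
        "scheduleStartDate": _date(run_date),
        "scheduleEndDate": _date(run_date),
    }
    return job


def event_driven_job(project_id, s3_bucket, gcs_bucket, sqs_queue_arn):
    """Event-driven job: listens to the SQS queue that receives S3 event notifications."""
    job = _base_job(project_id, s3_bucket, gcs_bucket)
    job["eventStream"] = {"name": sqs_queue_arn}
    return job
```

A scheduled job would use the same base body with a schedule that spans more than one day.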

To implement a data migration plan, you can configure one or more data transfer jobs. For example, to reduce the length of cut-over windows during the migration, you can implement a continuous replication data migration strategy as follows:

  1. Configure a run-once transfer job to copy the data from an Amazon S3 bucket to the Cloud Storage bucket.
  2. Perform data validation and consistency checks to compare data in the Amazon S3 bucket against the copied data in the Cloud Storage bucket.
  3. Set up event-driven transfer jobs to automatically transfer data from the Amazon S3 bucket to the Cloud Storage bucket when the content of the Amazon S3 bucket changes.
  4. Stop the workloads and services that have access to the data that's being migrated (that is, to the data that's involved in the previous step).
  5. Refactor workloads to use Cloud Storage instead of Amazon S3. You can refactor your workloads by using one of the following approaches, or by using the approaches in sequence:

  6. Wait for the replication to fully synchronize Cloud Storage with Amazon S3.

  7. Start your workloads.

  8. When you no longer need your Amazon S3 environment as a fallback option, retire it.
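The validation in step 2 can be sketched as a comparison of object listings. The following Python example assumes that you have already produced a mapping of object name to size in bytes for each bucket, for example from inventory reports. It compares names and sizes only. A stronger check would also compare checksums, keeping in mind that Amazon S3 ETags are not always MD5 digests (for example, for multipart uploads).

```python
def compare_listings(source, destination):
    """Compare {object name: size in bytes} listings from both buckets."""
    source_names = set(source)
    destination_names = set(destination)
    return {
        # Objects in Amazon S3 that were not copied.
        "missing": sorted(source_names - destination_names),
        # Objects in Cloud Storage that have no source counterpart.
        "unexpected": sorted(destination_names - source_names),
        # Objects present on both sides whose sizes differ.
        "size_mismatch": sorted(
            name
            for name in source_names & destination_names
            if source[name] != destination[name]
        ),
    }
```

An empty result in all three lists suggests that the copy is complete at the name and size level.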

Storage Transfer Service can preserve certain metadata when you migrate objects from a supported source to Cloud Storage. We recommend that you assess whether Storage Transfer Service can migrate the Amazon S3 metadata that you're interested in.

When you design your data migration plan, we recommend that you also assess AWS network egress costs and your Amazon S3 costs. For example, consider the following options to transfer data:

  • Across the public internet.
  • By using an interconnect link.
  • By using Amazon CloudFront.
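To compare these options, a back-of-the-envelope calculation is often enough for a first pass. The following Python sketch multiplies the amount of data by a per-gigabyte price for each option. The prices that you pass in are your own inputs taken from current AWS pricing. The sketch assumes decimal gigabytes and ignores request charges and tiered pricing.

```python
def estimate_egress_cost(total_bytes, price_per_gb):
    """Estimate the data transfer cost for moving total_bytes out of AWS."""
    gigabytes = total_bytes / 1_000_000_000  # Decimal gigabytes (assumption).
    return gigabytes * price_per_gb


def compare_options(total_bytes, prices_per_gb):
    """Return the estimated cost per transfer option, rounded to cents.

    prices_per_gb maps an option name to a per-GB price that you supply.
    """
    return {
        option: round(estimate_egress_cost(total_bytes, price), 2)
        for option, price in prices_per_gb.items()
    }
```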

The option that you choose can affect your AWS network egress costs and your Amazon S3 costs. It can also affect the amount of effort and resources that you need in order to provision and configure the infrastructure.

When you migrate data from Amazon S3 to Cloud Storage, we recommend that you use VPC Service Controls to build a perimeter that explicitly denies communication between Google Cloud services unless the services are authorized.

Optimize your Google Cloud environment

Optimization is the last phase of your migration. In this phase, you iterate on optimization tasks until your target environment meets your optimization requirements. The steps of each iteration are as follows:

  1. Assess your current environment, teams, and optimization loop.
  2. Establish your optimization requirements and goals.
  3. Optimize your environment and your teams.
  4. Tune the optimization loop.

You repeat this sequence until you've achieved your optimization goals.

For more information about optimizing your Google Cloud environment, see Migrate to Google Cloud: Optimize your environment and Google Cloud Architecture Framework: Performance optimization.

Contributors

Author: Marco Ferrari | Cloud Solutions Architect