Enterprise application on Compute Engine VMs with Oracle Exadata in Google Cloud

Last reviewed 2024-09-30 UTC

This document provides a reference architecture for a highly available enterprise application that is hosted on Compute Engine virtual machines (VMs) with low-latency connectivity to Oracle Cloud Infrastructure (OCI) Exadata databases that run in Google Cloud. The intended audience for this document is cloud architects and Oracle database administrators. The document assumes that you're familiar with Compute Engine and Oracle Exadata Database Service.

If you use Oracle Exadata or Oracle Real Application Clusters (Oracle RAC) to run Oracle databases on-premises, you can efficiently migrate your applications to Google Cloud and run your databases on Oracle Database@Google Cloud. Oracle Database@Google Cloud is a Google Cloud Marketplace offering that lets you run Oracle Exadata Database Service and Oracle Autonomous Database directly inside Google Cloud.

If you don't need the Oracle RAC capability or if you need an Oracle Database version other than 19c and 23ai, then you can run self-managed Oracle databases on Compute Engine VMs. For more information, see Enterprise application with Oracle Database on Compute Engine.

Architecture

The following diagram shows a high-level view of the architecture:

A high-level view of an architecture that uses Oracle Database@Google Cloud.

In the preceding diagram, an external load balancer receives requests from users of a public-facing application and distributes the requests to frontend web servers. The web servers forward the user requests through an internal load balancer to application servers. The application servers read data from and write data to databases in Oracle Database@Google Cloud. Administrators and OCI services can connect to and interact with the Oracle databases.

The following diagram shows a detailed view of the architecture:

A detailed view of an architecture that uses Oracle Database@Google Cloud.

In this architecture, the web tier and application tier run in active-active mode on Compute Engine VMs that are distributed across two zones within a Google Cloud region. The application uses Oracle Exadata databases in the same Google Cloud region.

All the components in the architecture are in a single Google Cloud region. This architecture is aligned with the regional deployment archetype. You can adapt this architecture to build a topology that is robust against regional outages by using the multi-regional deployment archetype. For more information, see Multi-regional deployment on Compute Engine and also the guidance in the Reliability section later in this document.

The architecture that's shown in the preceding diagram includes the following components:

Component: Purpose
Regional external Application Load Balancer: The regional external Application Load Balancer receives user requests and distributes them to the web tier VMs.
Google Cloud Armor security policy: The Google Cloud Armor security policy helps to protect your application stack against threats like distributed denial-of-service (DDoS) attacks and cross-site scripting (XSS).
Regional managed instance group (MIG) for the web tier: The web tier of the application is deployed on Compute Engine VMs that are part of a regional MIG. This MIG is the backend for the external Application Load Balancer. The MIG contains Compute Engine VMs in two zones. Each of these VMs hosts an independent instance of the web tier of the application.
Regional internal Application Load Balancer: The regional internal Application Load Balancer distributes traffic from the web tier VMs to the application tier VMs.
Regional MIG for the application tier: The application tier, such as an Oracle WebLogic Server cluster, is deployed on Compute Engine VMs that are part of a regional MIG. This MIG is the backend for the internal Application Load Balancer. The MIG contains Compute Engine VMs in two zones. Each VM hosts an independent instance of the application server.
Virtual Private Cloud (VPC) network and subnet: All of the Google Cloud resources in the architecture use a single VPC network. Depending on your requirements, you can choose to build an architecture that uses multiple networks. For more information, see Deciding whether to create multiple VPC networks.
Oracle Database@Google Cloud:

The application servers read data from and write to Oracle databases in Oracle Exadata Database Service. You provision Oracle Exadata Database Service by using Oracle Database@Google Cloud, a Cloud Marketplace offering that lets you run Oracle databases on Oracle-managed hardware within a Google Cloud data center.

You use Google Cloud interfaces like the Google Cloud console, Google Cloud CLI, and APIs to create Exadata Infrastructure instances. Oracle sets up and manages the required compute, storage, and networking infrastructure in a data center within a Google Cloud region on hardware that's dedicated for your project.

Exadata Infrastructure instances: Each Exadata Infrastructure instance contains two or more physical database servers and three or more storage servers. These servers, which aren't shown in the diagram, are interconnected using a low-latency network fabric. When you create an Exadata Infrastructure instance, you specify the number of database servers and storage servers that must be provisioned.
Exadata VM Clusters:

Within an Exadata Infrastructure instance, you create one or more Exadata VM Clusters. For example, you can choose to create and use a separate Exadata VM Cluster to host the databases that are required for each of your business units. Each Exadata VM Cluster contains one or more Oracle Linux VMs that host Oracle Database instances.

When you create an Exadata VM Cluster, you specify the following:

  • The number of database servers.
  • The compute, memory, and storage capacity to be allocated to each VM in the cluster.
  • The VPC network that the cluster must connect to.
  • IP address ranges of the backup and client subnets for the cluster.

The VMs within Exadata VM Clusters are not Compute Engine VMs.

Oracle Database instances: You create and manage Oracle databases through the OCI console and other OCI interfaces. Oracle Database software runs on the VMs within the Exadata VM Cluster. When you create the Exadata VM Cluster, you specify the Oracle Grid Infrastructure version. You also choose the license type: either bring your own licenses (BYOL) or opt for the license-included model.
OCI VCN and subnets: When you create an Exadata VM Cluster, an OCI virtual cloud network (VCN) is created automatically. The VCN has a client subnet and a backup subnet with IP address ranges that you specify. The client subnet is used for connectivity from your VPC network to the Oracle databases. The backup subnet is used to send database backups to OCI Object Storage.
Cloud Router, Partner Interconnect, and OCI DRG: Traffic between your VPC network and the VCN is routed by a Cloud Router that's attached to the VPC and through a dynamic routing gateway (DRG) that's attached to the VCN. The traffic flows through a low-latency connection that Google sets up using Partner Interconnect.
Private Cloud DNS zone: When you create an Exadata VM Cluster, a Cloud DNS private zone is created automatically. When your application servers send read and write requests to the Oracle databases, Cloud DNS resolves the database hostnames to the corresponding IP addresses.
OCI Object Storage and OCI Service Gateway: By default, backups of the Oracle Exadata databases are stored in OCI Object Storage. Database backups are routed to OCI Object Storage through a Service Gateway.
Public Cloud NAT gateway: The architecture includes a public Cloud NAT gateway to enable secure outbound connections from the Compute Engine VMs, which have only internal IP addresses.
Cloud Interconnect and Cloud VPN: To connect your on-premises network to the VPC network in Google Cloud, you can use Cloud Interconnect or Cloud VPN. For information about the relative advantages of each approach, see Choosing a Network Connectivity product.
Cloud Monitoring: You can use Cloud Monitoring to observe the behavior, health, and performance of your application and Google Cloud resources, including the Oracle Exadata resources. You can also monitor the Oracle Exadata resources by using the OCI Monitoring service.

Products used

This reference architecture uses the following Google Cloud products:

  • Compute Engine: A secure and customizable compute service that lets you create and run VMs on Google's infrastructure.
  • Cloud Load Balancing: A portfolio of high performance, scalable, global and regional load balancers.
  • Virtual Private Cloud (VPC): A virtual system that provides global, scalable networking functionality for your Google Cloud workloads.
  • Google Cloud Armor: A network security service that offers web application firewall (WAF) rules and helps to protect against DDoS and application attacks.
  • Cloud NAT: A service that provides Google Cloud-managed high-performance network address translation.
  • Cloud Monitoring: A service that provides visibility into the performance, availability, and health of your applications and infrastructure.
  • Cloud Interconnect: A service that extends your external network to the Google network through a high-availability, low-latency connection.
  • Partner Interconnect: A service that provides connectivity between your on-premises network and your Virtual Private Cloud networks and other networks through a supported service provider.
  • Cloud VPN: A service that securely extends your peer network to Google's network through an IPsec VPN tunnel.

This reference architecture uses the following OCI products:

  • Exadata Database Service on Dedicated Infrastructure: A service that lets you run Oracle Database instances on Exadata hardware that's dedicated for you.
  • Object Storage: A service for storing large amounts of structured and unstructured data as objects.
  • VCN and subnets: A VCN is a virtual and private network for resources in an OCI region. A subnet is a contiguous range of IP addresses within a VCN.
  • Dynamic Routing Gateway: A virtual router for traffic between a VCN and external networks.
  • Service Gateway: A gateway to let resources in a VCN access specific Oracle services privately.

Design considerations

This section describes design factors, best practices, and design recommendations that you should consider when you use this reference architecture to develop a topology that meets your specific requirements for security, reliability, operational efficiency, cost, and performance.

The guidance in this section isn't exhaustive. Depending on the specific requirements of your application and the Google Cloud and third-party products and features that you use, there might be additional design factors and trade-offs that you should consider.

System design

This section provides guidance to help you to choose Google Cloud regions for your deployment and to select appropriate Google Cloud services.

Region selection

When you choose the Google Cloud region for your deployment, consider the following factors and requirements:

  • Availability of Google Cloud services in each region. For more information, see Products available by location.
  • Availability of Compute Engine machine types in each region. For more information, see Regions and zones.
  • Availability of Oracle Database@Google Cloud in each region. For more information, see Available configurations.
  • End-user latency requirements.
  • Cost of Google Cloud resources.
  • Regulatory requirements.

Some of these factors and requirements might involve trade-offs. For example, the most cost-efficient region might not have the lowest carbon footprint. For more information, see Best practices for Compute Engine regions selection.

Compute infrastructure

The reference architecture in this document uses Compute Engine VMs to host the web tier and application tier. Depending on the requirements of your application, you can choose the following other Google Cloud compute services:

  • Containers: You can run containerized applications in Google Kubernetes Engine (GKE) clusters. GKE is a container-orchestration engine that automates deploying, scaling, and managing containerized applications.
  • Serverless: If you prefer to focus your IT efforts on your data and applications instead of on setting up and operating infrastructure resources, then you can use serverless services like Cloud Run and Cloud Run functions.

The decision of whether to use VMs, containers, or serverless services involves a trade-off between configuration flexibility and management effort. VMs and containers provide more configuration flexibility and control, but you're responsible for managing the resources. In a serverless architecture, you deploy workloads to a preconfigured platform that requires minimal management effort. The design guidance for those services is outside the scope of this document. For more information about service options, see Hosting Applications on Google Cloud.

Storage options

To provide persistent storage for the Compute Engine VMs in the web tier and application tier, choose an appropriate Persistent Disk or Google Cloud Hyperdisk type based on your application's requirements for capacity, scaling, availability, and performance. For more information, see Storage options.

If you need low-cost storage that's redundant across the zones within a region, use Cloud Storage regional buckets.

To store data that's shared across multiple VMs in a region, like configuration files for all the VMs in the web tier, you can use a Filestore Regional instance. The data that you store in a Filestore Regional instance is replicated synchronously across three zones within the region. This replication helps to ensure high availability and robustness against zone outages for the data that you store in Filestore. You can store shared configuration files, common tools and utilities, and centralized logs in the Filestore instance, and mount the instance on multiple VMs.

When you design storage for your workloads, consider the functional characteristics of the workloads, resilience requirements, performance expectations, and cost goals. For more information, see Design an optimal storage strategy for your cloud workload.

Network design

When you build infrastructure for a multi-tier application stack, you must choose a network design that meets your business and technical requirements. The architecture that's shown in this document uses a simple network topology with a single VPC network. Depending on your requirements, you can choose to use multiple networks. For more information, see Deciding whether to create multiple VPC networks.

When you assign IP address ranges for the client and backup subnets to be used for the Exadata VM Clusters, consider the minimum subnet size requirements. For more information, see Plan for IP Address Space in Oracle Database@Google Cloud.
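As an illustration of this planning step, the following minimal Python sketch checks a proposed pair of client and backup ranges before you create an Exadata VM Cluster. The minimum sizes in the constants are placeholder assumptions, not Oracle's published requirements, and the function name and example ranges are hypothetical. Replace the constants with the values from Plan for IP Address Space in Oracle Database@Google Cloud.

```python
import ipaddress

# ASSUMPTION: placeholder minimum sizes (address counts). Look up the real
# requirements in Oracle's IP address planning guidance before relying on them.
ASSUMED_MIN_CLIENT_ADDRESSES = 16
ASSUMED_MIN_BACKUP_ADDRESSES = 8


def validate_exadata_subnets(existing_vpc_ranges, client_cidr, backup_cidr):
    """Return a list of problems found with the proposed subnet ranges."""
    client = ipaddress.ip_network(client_cidr)
    backup = ipaddress.ip_network(backup_cidr)
    problems = []

    # Size checks against the assumed minimums.
    if client.num_addresses < ASSUMED_MIN_CLIENT_ADDRESSES:
        problems.append(f"client subnet {client} is below the assumed minimum size")
    if backup.num_addresses < ASSUMED_MIN_BACKUP_ADDRESSES:
        problems.append(f"backup subnet {backup} is below the assumed minimum size")

    # The two Exadata subnets must not overlap each other.
    if client.overlaps(backup):
        problems.append(f"client subnet {client} overlaps backup subnet {backup}")

    # Neither subnet should collide with ranges already used in the VPC network.
    for cidr in existing_vpc_ranges:
        existing = ipaddress.ip_network(cidr)
        for label, candidate in (("client", client), ("backup", backup)):
            if candidate.overlaps(existing):
                problems.append(f"{label} subnet {candidate} overlaps VPC range {existing}")

    # Private ranges are recommended for the subnets used by the Exadata VMs.
    for label, candidate in (("client", client), ("backup", backup)):
        if not candidate.is_private:
            problems.append(f"{label} subnet {candidate} is not a private range")

    return problems
```

For example, `validate_exadata_subnets(["10.0.0.0/24"], "10.1.0.0/24", "10.2.0.0/28")` returns an empty list, whereas reusing `10.0.0.0/24` for the client subnet reports an overlap with the existing VPC range.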

Database migration

When you plan to migrate on-premises databases to Oracle Database@Google Cloud, assess your current database environment and get configuration and sizing recommendations by using the Database Migration Assessment (DMA) tool.

To migrate on-premises data to Oracle database deployments in Google Cloud, you can use standard Oracle tools like Oracle GoldenGate.

Before you use the migrated databases in a production environment, verify connectivity from your applications to the databases.
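A basic connectivity check can be scripted from an application tier VM. The following Python sketch only verifies that a TCP connection can be opened to the database listener; it doesn't authenticate or run a query. The default port 1521 is the usual Oracle Net listener port, which is an assumption about your configuration, and any hostname that you pass is a placeholder for the name that your deployment resolves through Cloud DNS.

```python
import socket


def can_connect(host, port=1521, timeout_seconds=3.0):
    """Return True if a TCP connection to host:port can be opened.

    ASSUMPTION: 1521 is the listener port; adjust it if your databases
    listen on a different port.
    """
    try:
        # create_connection resolves the hostname and attempts the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout_seconds):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False
```

A `False` result points to a name-resolution, routing, or firewall problem between the VPC network and the client subnet. A `True` result confirms only network reachability, so follow it with an application-level test that uses your database driver.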

Security

This section describes factors to consider when you use this reference architecture to design a topology in Google Cloud that meets the security and compliance requirements of your workloads.

Protection against external threats

To protect your application against external threats like DDoS attacks and XSS, define appropriate Google Cloud Armor security policies based on your requirements. Each policy is a set of rules that specifies the conditions to be evaluated and actions to take when the conditions are met. For example, a rule could specify that if the source IP address of incoming traffic matches a specific IP address or CIDR range, then the traffic must be denied. You can also apply preconfigured WAF rules. For more information, see Security policy overview.
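To make the rule model concrete, the following Python sketch mimics how a security policy might evaluate source-IP rules: rules are checked in priority order (a lower number means a higher priority), and the first matching rule determines the action. This is a simplified illustration, not the Google Cloud Armor implementation. The `Rule` class, the `evaluate` function, and the example ranges are all hypothetical.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    priority: int        # lower number is evaluated first
    src_ranges: tuple    # CIDR ranges that the rule matches
    action: str          # "allow" or "deny"


def evaluate(rules, source_ip):
    """Return the action of the highest-priority rule that matches source_ip."""
    address = ipaddress.ip_address(source_ip)
    for rule in sorted(rules, key=lambda r: r.priority):
        if any(address in ipaddress.ip_network(cidr) for cidr in rule.src_ranges):
            return rule.action
    # ASSUMPTION: traffic that matches no rule is allowed. A real policy
    # always ends with a default rule that makes this choice explicit.
    return "allow"


# Hypothetical policy: deny one documentation range, allow everything else.
EXAMPLE_POLICY = [
    Rule(priority=1000, src_ranges=("203.0.113.0/24",), action="deny"),
    Rule(priority=2147483647, src_ranges=("0.0.0.0/0",), action="allow"),
]
```

With this policy, `evaluate(EXAMPLE_POLICY, "203.0.113.7")` returns `"deny"`, and any other IPv4 source falls through to the default allow rule.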

External access for VMs

In the reference architecture that this document describes, the VMs that host the web tier and the application tier don't need direct inbound access from the internet. Don't assign external IP addresses to those VMs. Google Cloud resources that have only private, internal IP addresses can still access certain Google APIs and services by using Private Service Connect or Private Google Access. For more information, see Private access options for services.

To enable secure outbound connections from Google Cloud resources that have only private IP addresses, like the Compute Engine VMs in this reference architecture, you can use Cloud NAT as shown in the preceding architecture diagram, or use Secure Web Proxy.

For the subnets that are used by the Exadata VMs, Oracle recommends that you assign private IP address ranges.

VM image security

Approved images are images with software that meets your policy or security requirements. To ensure that your VMs in the web tier and application tier use only approved images, you can define an organization policy that restricts the use of images in specific public image projects. For more information, see Setting up trusted image policies.

Service account privileges

In Google Cloud projects where the Compute Engine API is enabled, a default service account is created automatically. For Google Cloud organizations that were created before May 3, 2024, this default service account is granted the Editor IAM role (roles/editor), unless this behavior is disabled.

By default, the default service account is attached to all Compute Engine VMs that you create by using the gcloud CLI or the Google Cloud console. The Editor role includes a broad range of permissions, so attaching the default service account to VMs creates a security risk. To avoid this risk, you can create and use dedicated service accounts for each tier of the application stack. To specify the resources that the service account can access, use fine-grained policies. For more information, see Limit service account privileges.

Network security

To control network traffic between the resources in the web tier and application tier of the architecture, you must configure appropriate Cloud Next Generation Firewall (NGFW) policies.

Database security and compliance

The Exadata Database service includes Oracle Data Safe, which helps you manage security and compliance requirements for Oracle databases. You can use Oracle Data Safe to evaluate security controls, monitor user activity, and mask sensitive data. For more information, see Manage Database Security with Oracle Data Safe.

More security considerations

When you build the architecture for your workload, consider the platform-level security best practices and recommendations that are provided in the Enterprise foundations blueprint.

Reliability

This section describes design factors to consider when you use this reference architecture to build and operate reliable infrastructure for your deployment in Google Cloud.

Robustness against VM failures

In the architecture that's shown in this document, if a Compute Engine VM in the web tier or application tier crashes, the relevant MIG recreates the VM automatically. The load balancers forward requests to only the currently available web server instances and application server instances.

VM autohealing

Sometimes the VMs that host your web tier and application tier might be running and available, but there might be issues with the application itself. The application might freeze, crash, or not have enough memory. In this scenario, the VMs won't respond to load balancer health checks, and the load balancer won't route traffic to the unresponsive VMs. To help ensure that applications respond as expected, you can configure application-based health checks as part of the autohealing policy of your MIGs. If the application on a particular VM isn't responding, the MIG autoheals (repairs) the VM. For more information about configuring autohealing, see About repairing VMs for high availability.
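An application-based health check needs an endpoint that reflects the state of the application rather than just the VM. The following Python sketch shows one possible shape for such an endpoint, using only the standard library. The `/healthz` path, port 8080, and the `application_is_healthy` placeholder are assumptions for illustration; a real check would test the dependencies that matter to your application.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

HEALTH_PATH = "/healthz"  # hypothetical path that the health check would probe


def application_is_healthy():
    """Placeholder: replace with real checks (thread pools, queues, and so on)."""
    return True


def health_response(path, healthy):
    """Map a request path and health state to an HTTP status and body."""
    if path != HEALTH_PATH:
        return 404, b"not found"
    if healthy:
        return 200, b"ok"
    # A non-2xx status signals that the instance should be treated as unhealthy.
    return 503, b"unhealthy"


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, body = health_response(self.path, application_is_healthy())
        self.send_response(status)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8080):
    """Run the health endpoint until interrupted (not called automatically)."""
    HTTPServer(("", port), HealthHandler).serve_forever()
```

Keeping the status logic in `health_response` separates the decision from the HTTP plumbing, which makes the behavior easy to test without starting a server.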

Robustness against region outages

If a region outage occurs, then the application is unavailable. To reduce the downtime caused by region outages, you can implement the following approach:

  • Maintain a passive (failover) replica of the web tier and application tier in another Google Cloud region.
  • Create a standby Exadata Infrastructure instance with the required Exadata VM Clusters in the same region that has the passive replica of the application stack. Use Oracle Data Guard for data replication and automatic failover to the standby Exadata databases. If your application needs a lower recovery point objective (RPO), you can back up and recover the databases by using Oracle Autonomous Recovery Service.
  • If an outage occurs in the primary region, use the database replica or backup to restore the database to production and to activate the application in the failover region.
  • Use DNS routing policies to route traffic to an external load balancer in the failover region.
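The routing decision in the last step can be pictured as a priority-ordered choice between regions. The following Python sketch is a conceptual model only, not the Cloud DNS routing policy API. The `select_region` function and the region names in the example are hypothetical.

```python
def select_region(regions_by_priority, health):
    """Return the first region, in priority order, whose load balancer is healthy.

    regions_by_priority: region names, primary first.
    health: mapping of region name to a boolean health state.
    Returns None if no region is healthy.
    """
    for region in regions_by_priority:
        # A region with no reported health state is treated as unhealthy.
        if health.get(region, False):
            return region
    return None
```

For example, with `["primary-region", "failover-region"]` as the priority list, traffic goes to `"primary-region"` while it's healthy and moves to `"failover-region"` only when the primary is reported as unhealthy.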

For business-critical applications that must continue to be available even when a region outage occurs, consider using the multi-regional deployment archetype. You can use Oracle Active Data Guard to provide a read-only standby database in the failover region.

Oracle manages the infrastructure in Oracle Database@Google Cloud. For information about the service level objectives (SLOs) for Oracle Exadata Database Service on Dedicated Infrastructure, see Service Level Objectives for Oracle PaaS and IaaS Public Cloud Services.

MIG autoscaling

The architecture in this document uses regional MIGs for the web tier and application tier. The autoscaling capability of stateless MIGs helps to keep the web tier and application tier available during single-zone outages. Stateful MIGs can't be autoscaled.

To control the autoscaling behavior of your MIGs, you can specify target utilization metrics, such as average CPU utilization. You can also configure schedule-based autoscaling. For more information, see Autoscaling groups of instances.
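To build intuition for how a target utilization drives group size, the following Python sketch uses a simple proportional model: scale the group so that the observed load would sit at the target. This is an illustrative approximation under stated assumptions, not the algorithm that the Compute Engine autoscaler uses. The function and parameter names are hypothetical.

```python
import math


def recommended_size(current_size, observed_utilization, target_utilization,
                     min_size, max_size):
    """Estimate a group size that brings utilization back to the target.

    Utilization values are fractions (for example, 0.6 for 60% CPU).
    ASSUMPTION: load spreads evenly across VMs and scales linearly.
    """
    if target_utilization <= 0:
        raise ValueError("target_utilization must be positive")
    # Total load, in VM-equivalents, divided by the load each VM should carry.
    needed = math.ceil(current_size * observed_utilization / target_utilization)
    # Clamp the result to the configured autoscaling limits.
    return max(min_size, min(max_size, needed))
```

For example, five VMs at 80% average CPU with a 60% target need about 6.67 VM-equivalents, so the sketch recommends seven VMs. The minimum and maximum limits bound the result in both directions.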

VM placement

In the architecture that this document describes, the application tier and web tier run on Compute Engine VMs that are distributed across multiple zones. This distribution helps to ensure that your web tier and your application tier are robust against single-zone outages. To improve this robustness further, you can create a spread placement policy and apply it to the MIG template. With a spread placement policy, when the MIG creates VMs, it places them within each zone on different physical servers (called hosts), so your VMs are robust against failures of individual hosts. However, a trade-off with this approach is that the latency for inter-VM network traffic might increase. For more information, see Placement policies overview.
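The effect of distributing VMs across zones can be estimated with simple arithmetic. The following Python sketch assumes that VMs are spread as evenly as possible across the selected zones and computes how many VMs remain if one zone fails. The function names and zone labels are hypothetical, and the sketch ignores any VMs that the MIG creates in the remaining zones after the outage.

```python
def distribute_evenly(vm_count, zones):
    """Spread vm_count VMs across zones so that counts differ by at most one."""
    base, extra = divmod(vm_count, len(zones))
    # The first `extra` zones each receive one additional VM.
    return {zone: base + (1 if index < extra else 0)
            for index, zone in enumerate(zones)}


def surviving_capacity(distribution, failed_zone):
    """Return the number of VMs that remain when failed_zone is unavailable."""
    return sum(count for zone, count in distribution.items()
               if zone != failed_zone)
```

For example, five VMs across two zones yields a 3/2 split, so losing the larger zone leaves two VMs. You can use this kind of estimate to size the group so that the remaining VMs can carry the expected load during a single-zone outage.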

VM capacity planning

To make sure that capacity for Compute Engine VMs is available when required for MIG autoscaling, you can create reservations. A reservation provides assured capacity in a specific zone for a specified number of VMs of a machine type that you choose. A reservation can be specific to a project, or it can be shared across multiple projects. You incur charges for reserved resources even if the resources aren't provisioned or used. For more information about reservations, including billing considerations, see Reservations of Compute Engine zonal resources.

Stateful storage

A best practice in application design is to avoid the need for stateful local disks. But if the requirement exists, you can configure your disks to be stateful to ensure that the data is preserved when the VMs are repaired or recreated. However, we recommend that you keep the boot disks stateless, so that you can update them easily to the latest images with new versions and security patches. For more information, see Configuring stateful persistent disks in MIGs.

Database capacity

You can scale Exadata Infrastructure by adding database servers and storage servers as needed. After you add the required database servers or storage servers to Exadata Infrastructure, to be able to use the additional CPU or storage resources, you must add the capacity to the associated Exadata VM cluster. For more information, see Scaling Exadata Compute and Storage.

Backup and recovery

You can use Backup and DR Service to create, store, and manage backups of Compute Engine VMs. Backup and DR Service stores backup data in its original, application-readable format. When required, you can restore workloads to production by directly using data from long-term backup storage without time-consuming data-movement or preparation activities. For more information, see Backup and DR Service for Compute Engine instance backups.

By default, backups of databases in Oracle Exadata Database Service on Dedicated Infrastructure are stored in OCI Object Storage. To achieve a lower RPO, you can back up and recover the databases by using Oracle Autonomous Recovery Service.

More reliability considerations

When you build the cloud architecture for your workload, review the reliability-related best practices and recommendations that are provided in the following documentation:

Cost optimization

This section provides guidance to optimize the cost of setting up and operating a Google Cloud topology that you build by using this reference architecture.

VM machine types

To help you optimize the utilization of your VMs, Compute Engine provides machine type recommendations. Use the recommendations to choose machine types that match the compute requirements of your web tier and application tier VMs. For workloads that have predictable resource requirements, you can customize the machine type to your needs and save money by using custom machine types.

VM provisioning model

If your application is fault tolerant, then Spot VMs can help to reduce the Compute Engine costs for your VMs in the web tier and application tier. The cost of Spot VMs is significantly lower than that of regular VMs. However, Compute Engine might preemptively stop or delete Spot VMs to reclaim capacity.

Spot VMs are suitable for batch jobs that can tolerate preemption and don't have high availability requirements. Spot VMs offer the same machine types, options, and performance as regular VMs. However, when the resource capacity in a zone is limited, the MIGs might not be able to scale out (that is, create VMs) automatically to reach the specified target size until the required capacity becomes available again.

VM resource utilization

The autoscaling capability of stateless MIGs enables your application to gracefully handle increases in traffic to the web tier and application tier. Autoscaling also helps you to reduce cost when the need for resources is low. Stateful MIGs can't be autoscaled.

Database costs

When you create an Exadata VM Cluster, you can choose to BYOL or to provision license-included Oracle databases.

Networking charges for data transfer between your applications and Oracle Exadata databases that are within the same region are included in the price of the Oracle Database@Google Cloud offering.

More cost considerations

When you build the architecture for your workload, also consider the general best practices and recommendations that are provided in Google Cloud Architecture Framework: Cost optimization.

Operational efficiency

This section describes the factors to consider when you use this reference architecture to design a Google Cloud topology that you can operate efficiently.

VM configuration updates

To update the configuration of the VMs in a MIG (like the machine type or boot-disk image), you create a new instance template with the required configuration and then apply the new template to the MIG. The MIG updates the VMs by using an update method that you specify: automatic or selective. Choose an appropriate method based on your requirements for availability and operational efficiency. For more information about these MIG update methods, see Apply new VM configurations in a MIG.

VM images

For your MIG instance templates, instead of using Google-provided public images, we recommend that you create and use custom OS images that include the configurations and software that your applications require. You can group your custom images into a custom image family. An image family always points to the most recent image in that family, so your instance templates and scripts can use that image without you having to update references to a specific image version. You must regularly update your custom images to include security updates and patches that are provided by the OS vendor.
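The way an image family resolves to a concrete image can be modeled in a few lines. The following Python sketch is a conceptual illustration, not the Compute Engine API: it picks the most recently created image in a family and, as an assumption of this model, skips images that are marked as deprecated. The `Image` class and `resolve_family` function are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Image:
    name: str
    family: str
    created: datetime
    deprecated: bool = False


def resolve_family(images, family):
    """Return the newest usable image in the given family."""
    # ASSUMPTION: deprecated images are never selected.
    candidates = [image for image in images
                  if image.family == family and not image.deprecated]
    if not candidates:
        raise LookupError(f"no usable image in family {family!r}")
    return max(candidates, key=lambda image: image.created)
```

Because instance templates refer to the family rather than to a specific image, publishing a newer image in the family changes what `resolve_family` returns without any change to the template.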

Deterministic instance templates

If the instance templates that you use for your MIGs include startup scripts (for example, to install third-party software), make sure that the scripts explicitly specify the software-installation parameters, like the software version. Otherwise, when the MIG creates the VMs, the software that's installed on the VMs might not be consistent. For example, if your instance template includes a startup script to install Apache HTTP Server (the apache2 package), then make sure that the script specifies the exact apache2 version that should be installed, such as version 2.4.53. For more information, see Deterministic instance templates.
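One way to catch non-deterministic startup scripts before they reach an instance template is a small lint step. The following Python sketch scans a script for `apt-get install` or `apt install` commands and reports packages that aren't pinned with the `package=version` syntax. It's a deliberately simple heuristic: it assumes one install command per line and doesn't handle line continuations or other package managers. The function name and the version string in the example are illustrative.

```python
import re

# Matches "apt-get install ..." or "apt install ..." and captures the arguments.
INSTALL_PATTERN = re.compile(r"\bapt(?:-get)?\s+install\s+(.*)")


def unpinned_apt_packages(startup_script):
    """Return package names that are installed without an explicit version."""
    unpinned = []
    for raw_line in startup_script.splitlines():
        line = raw_line.strip()
        if line.startswith("#"):
            continue  # skip comments and the shebang line
        match = INSTALL_PATTERN.search(line)
        if match is None:
            continue
        for token in match.group(1).split():
            if token.startswith("-"):
                continue  # an option such as -y, not a package
            if "=" not in token:
                unpinned.append(token)
    return unpinned
```

For a script that runs `apt-get install -y apache2=2.4.53-1 curl` (the version string is a made-up example), the function returns `["curl"]`, which flags the package that could resolve to different versions on different VMs.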

Database administration

Oracle manages the physical database servers, storage servers, and networking hardware in Oracle Exadata Database Service on Dedicated Infrastructure. You can manage the Exadata Infrastructure instances and the Exadata VM Clusters through the OCI or Google Cloud interfaces. You create and manage databases through the OCI interfaces. The Google Cloud console pages for Oracle Database@Google Cloud include links that you can use to go directly to the relevant pages in the OCI console. To avoid the need to sign in again to OCI, you can configure identity federation between OCI and Google Cloud.

More operational considerations

When you build the architecture for your workload, consider the general best practices and recommendations for operational efficiency that are described in Google Cloud Architecture Framework: Operational excellence.

Performance optimization

This section describes the factors to consider when you use this reference architecture to design a topology in Google Cloud that meets the performance requirements of your workloads.

Compute performance

Compute Engine offers a wide range of predefined and customizable machine types that you can choose from depending on the performance requirements of your workloads. For the VMs that host the web tier and application tier, choose an appropriate machine type based on your performance requirements for those tiers. For more information, see Machine series comparison.

Network performance

For workloads that need low inter-VM network latency within the application and web tiers, you can create a compact placement policy and apply it to the MIG template that's used for those tiers. When the MIG creates VMs, it places the VMs on physical servers that are close to each other. While a compact placement policy helps improve inter-VM network performance, a spread placement policy can help improve VM availability as described earlier. To achieve an optimal balance between network performance and availability, when you create a compact placement policy, you can specify how far apart the VMs must be placed. For more information, see Placement policies overview.

Compute Engine has a per-VM limit for egress network bandwidth. This limit depends on the VM's machine type and whether traffic is routed through the same VPC network as the source VM. For VMs with certain machine types, to improve network performance, you can get a higher maximum egress bandwidth by enabling Tier_1 networking. For more information, see Configure per VM Tier_1 networking performance.

Network traffic between the application tier VMs and the Oracle Exadata network is routed through a low-latency Partner Interconnect connection that Google sets up.

Exadata Infrastructure uses RDMA over Converged Ethernet (RoCE) for high bandwidth and low latency networking among the database servers and storage servers. The servers exchange data directly in main memory without involving the processor, cache, or operating system.

More performance considerations

When you build the architecture for your workload, consider the general best practices and recommendations that are provided in Google Cloud Architecture Framework: Performance optimization.

Contributors

Author: Kumar Dhanagopal | Cross-Product Solution Developer