RDMA network profiles

This page provides an overview of Remote Direct Memory Access (RDMA) network profiles in Google Cloud.

Overview

RDMA network profiles let you create Virtual Private Cloud (VPC) networks that provide low-latency, high-bandwidth RDMA communication between the memory or GPUs of VMs that are created in the network.

RDMA network profiles are useful for running AI workloads. For more information about running AI workloads in Google Cloud, see AI Hypercomputer overview.

You can create the following types of VPC networks by using RDMA network profiles:

VPC network type | Network profile resource name | Protocol | Supported NIC type | Supported machine types
Falcon VPC network (Preview) | ZONE-vpc-falcon | RDMA over Falcon transport | IRDMA | H4D
RoCE VPC network | ZONE-vpc-roce | RDMA over Converged Ethernet v2 (RoCE v2) | MRDMA | A3 Ultra, A4, A4X
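
As a minimal sketch of the naming pattern in the table above, the following Python helper builds a network profile resource name from a zone and a network type. The helper name, the dictionary, and the example zone are illustrative assumptions and are not part of any Google Cloud library.

```python
# Hypothetical helper: maps a VPC network type to the profile suffix
# shown in the "Network profile resource name" column.
PROFILE_SUFFIXES = {"falcon": "vpc-falcon", "roce": "vpc-roce"}


def network_profile_name(zone: str, network_type: str) -> str:
    """Return the ZONE-vpc-TYPE profile name for a zone and network type."""
    key = network_type.lower()
    if key not in PROFILE_SUFFIXES:
        raise ValueError(f"unknown RDMA network type: {network_type!r}")
    return f"{zone}-{PROFILE_SUFFIXES[key]}"


# The zone below is only an example; confirm that the profile is
# actually available in the zone that you plan to use.
print(network_profile_name("europe-west1-b", "roce"))  # europe-west1-b-vpc-roce
```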

Supported zones

RDMA network profiles are available in a limited set of zones. You can create a Falcon or RoCE VPC network only in a zone where the corresponding network profile is available.

To view the supported zones, see List network profiles.

Alternatively, you can view the supported zones for the machine type that you intend to create in the network. RDMA network profiles are available in the same zones as their supported machine types. For more information, see the following:

Specifications

VPC networks created with an RDMA network profile have the following specifications:

  • Zonal constraint. Resources that use a VPC network with an RDMA network profile are limited to the zone of the RDMA network profile that was associated with the VPC network when the network was created. This zonal limit has the following effects:

    • All instances that have network interfaces in the VPC network must be created in the zone that matches the zone of the RDMA network profile used by the VPC network.

    • All subnets created in the VPC network must be located in the region that contains the zone of the RDMA network profile used by the VPC network.

  • RDMA network interfaces only. A VPC network with an RDMA network profile supports attachments only from specific network interfaces:

    • Falcon VPC networks only support IRDMA network interfaces (NICs), which are only available on the H4D machine series.
    • RoCE VPC networks only support MRDMA NICs, which are only available on the A3 Ultra, A4, and A4X machine series.

    All non-RDMA NICs of a virtual machine (VM) instance must be attached to a regular VPC network.

  • 8896 byte MTU. For best performance, we recommend a maximum transmission unit (MTU) of 8896 bytes for VPC networks with an RDMA network profile. This allows the RDMA driver in the VM's guest operating system to use smaller MTUs if needed.

    If you create a VPC network with an RDMA network profile by using the gcloud CLI or the API, then 8896 bytes is the default MTU. If you use the Google Cloud console, then you must set the MTU to 8896.

  • Firewall differences. See the following information about firewall differences in VPC networks with an RDMA network profile:

    • VPC networks with an RDMA network profile use the following implied firewall rules, which are different from the implied firewall rules used by regular VPC networks:

      • Implied allow egress
      • Implied allow ingress
    • Cloud NGFW support depends on the type of VPC network:

      • RoCE VPC networks only support regional network firewall policies that have an RoCE firewall policy type. The set of parameters for rules within a supported regional network firewall policy is limited. For more information, see Cloud NGFW for RoCE VPC networks.
      • Falcon VPC networks don't support configuring Cloud NGFW rules or policies.
  • No Connectivity Tests support. Connectivity Tests doesn't support VPC networks with an RDMA network profile.

  • Other VPC features. VPC networks with an RDMA network profile support a limited set of other VPC features. For more information, see the following Supported and unsupported features section.
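
The zonal, NIC type, and MTU specifications above can be sketched as a small pre-flight check. This is an illustrative sketch only: `PlannedNic`, `check_nic`, and `region_of` are hypothetical names, and the region is derived on the assumption that a zone name is the region name followed by a hyphen and a zone letter.

```python
from dataclasses import dataclass

RECOMMENDED_MTU = 8896  # bytes, as recommended in the specifications above

# NIC type required by each type of RDMA VPC network.
REQUIRED_NIC = {"falcon": "IRDMA", "roce": "MRDMA"}


def region_of(zone: str) -> str:
    """Derive the region from a zone name such as 'us-central1-a'."""
    return zone.rsplit("-", 1)[0]


@dataclass
class PlannedNic:
    """Hypothetical description of one RDMA NIC that you plan to create."""
    instance_zone: str
    subnet_region: str
    nic_type: str


def check_nic(profile_zone: str, network_type: str, mtu: int,
              nic: PlannedNic) -> list[str]:
    """Return a list of problems; an empty list means no problem was found."""
    problems = []
    if nic.instance_zone != profile_zone:
        problems.append("instance must be in the zone of the network profile")
    if nic.subnet_region != region_of(profile_zone):
        problems.append("subnet must be in the region that contains that zone")
    if nic.nic_type != REQUIRED_NIC[network_type]:
        problems.append(f"{network_type} networks require {REQUIRED_NIC[network_type]} NICs")
    if mtu != RECOMMENDED_MTU:
        # The MTU is a recommendation, so this is a note rather than an error.
        problems.append(f"MTU {mtu} differs from the recommended {RECOMMENDED_MTU}")
    return problems


nic = PlannedNic("us-central1-a", "us-central1", "MRDMA")
print(check_nic("us-central1-a", "roce", 8896, nic))  # []
```

A check like this runs before any resource is created, so it only catches mismatches in the plan. It doesn't confirm that the profile or machine type is available in the chosen zone.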

Supported and unsupported features

The following table lists which VPC features are supported by VPC networks with an RDMA network profile.

Feature | Supported | Network profile property | Network profile property value | Details
RDMA NICs | Yes | interfaceTypes | MRDMA or IRDMA | VPC networks with an RDMA network profile support only the NIC type that corresponds to the RDMA network profile: IRDMA for Falcon VPC networks, and MRDMA for RoCE VPC networks. Other NIC types, such as GVNIC or VIRTIO_NET, aren't supported.
Multi-NIC in the same network | Yes | allowMultiNicInSameNetwork | MULTI_NIC_IN_SAME_NETWORK_ALLOWED | VPC networks with an RDMA network profile support multi-NIC VMs, allowing two or more RDMA NICs of the same VM to be in the same VPC network. Each NIC must attach to a unique subnet in the VPC network. See also RoCE VPC network multi-NIC considerations.
IPv4-only subnets | Yes | subnetworkStackTypes | SUBNET_STACK_TYPE_IPV4_ONLY | VPC networks with an RDMA network profile support IPv4-only subnets, including the same Valid IPv4 ranges as regular VPC networks. VPC networks with an RDMA network profile don't support dual-stack or IPv6-only subnets. For more information, see Types of subnets.
PRIVATE subnet purpose | Yes | subnetworkPurposes | SUBNET_PURPOSE_PRIVATE | VPC networks with an RDMA network profile support regular subnets, which have a purpose attribute value of PRIVATE. VPC networks with an RDMA network profile don't support Private Service Connect subnets, proxy-only subnets, or Private NAT subnets. For more information, see Purposes of subnets.
GCE_ENDPOINT address purpose | Yes | addressPurposes | GCE_ENDPOINT | VPC networks with an RDMA network profile support IP addresses with a purpose attribute value of GCE_ENDPOINT, which is used by internal IP addresses of VM NICs. VPC networks with an RDMA network profile don't support special purpose IP addresses, such as the SHARED_LOADBALANCER_VIP purpose. For more information, see the addresses resource reference.
Attachments from nic0 | No | allowDefaultNicAttachment | DEFAULT_NIC_ATTACHMENT_BLOCKED | VPC networks with an RDMA network profile don't support attaching the nic0 network interfaces of a VM to the network. Each RDMA NIC attached to the VPC network must not be nic0.
External IP addresses for VMs | No | allowExternalIpAccess | EXTERNAL_IP_ACCESS_BLOCKED | VPC networks with an RDMA network profile don't support assigning external IP addresses to RDMA NICs. Consequently, RDMA NICs don't have internet access.
Dynamic Network Interfaces | No | allowSubInterfaces | SUBINTERFACES_BLOCKED | VPC networks with an RDMA network profile don't support Dynamic NICs.
Alias IP ranges | No | allowAliasIpRanges | ALIAS_IP_RANGE_BLOCKED | VPC networks with an RDMA network profile don't support assigning alias IP ranges to RDMA NICs.
IP forwarding | No | allowIpForwarding | IP_FORWARDING_BLOCKED | VPC networks with an RDMA network profile don't support IP forwarding.
VM network migration | No | allowNetworkMigration | NETWORK_MIGRATION_BLOCKED | VPC networks with an RDMA network profile don't support migrating VM NICs between networks.
Auto mode | No | allowAutoModeSubnet | AUTO_MODE_SUBNET_BLOCKED | VPC networks with an RDMA network profile can't be auto mode networks. For more information, see subnet creation mode.
VPC Network Peering | No | allowVpcPeering | VPC_PEERING_BLOCKED | VPC networks with an RDMA network profile don't support connecting to other VPC networks using VPC Network Peering. Consequently, VPC networks with an RDMA network profile don't support connecting to services using private services access.
Static routes | No | allowStaticRoutes | STATIC_ROUTES_BLOCKED | VPC networks with an RDMA network profile don't support static routes.
Packet Mirroring | No | allowPacketMirroring | PACKET_MIRRORING_BLOCKED | VPC networks with an RDMA network profile don't support Packet Mirroring.
Cloud NAT | No | allowCloudNat | CLOUD_NAT_BLOCKED | VPC networks with an RDMA network profile don't support Cloud NAT.
Cloud Router | No | allowCloudRouter | CLOUD_ROUTER_BLOCKED | VPC networks with an RDMA network profile don't support Cloud Routers and dynamic routes.
Cloud Interconnect | No | allowInterconnect | INTERCONNECT_BLOCKED | VPC networks with an RDMA network profile don't support Cloud Interconnect VLAN attachments.
Cloud VPN | No | allowVpn | VPN_BLOCKED | VPC networks with an RDMA network profile don't support Cloud VPN tunnels.
Network Connectivity Center | No | allowNcc | NCC_BLOCKED | VPC networks with an RDMA network profile don't support Network Connectivity Center. You can't add a VPC network with an RDMA network profile as a VPC spoke to a Network Connectivity Center hub.
Cloud Load Balancing | No | allowLoadBalancing | LOAD_BALANCING_BLOCKED | VPC networks with an RDMA network profile don't support Cloud Load Balancing. Consequently, VPC networks with an RDMA network profile don't support load balancer features, including Google Cloud Armor.
Private Google Access | No | allowPrivateGoogleAccess | PRIVATE_GOOGLE_ACCESS_BLOCKED | VPC networks with an RDMA network profile don't support Private Google Access.
Private Service Connect | No | allowPsc | PSC_BLOCKED | VPC networks with an RDMA network profile don't support Private Service Connect.
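
To illustrate how a few of the NIC-level rows in the table combine for a single VM, the following sketch checks a planned list of RDMA NICs against them. The function name and the dictionary keys (`name`, `subnet`, `external_ip`, `alias_ip_ranges`) are hypothetical and chosen for this example. Only the property values in the messages come from the table.

```python
from collections import Counter


def check_rdma_nics(nics: list[dict]) -> list[str]:
    """Return problems found in the RDMA NICs planned for one VM."""
    problems = []
    for nic in nics:
        name = nic["name"]
        if name == "nic0":
            # nic0 can't attach to a VPC network with an RDMA network profile.
            problems.append(f"{name}: DEFAULT_NIC_ATTACHMENT_BLOCKED")
        if nic.get("external_ip"):
            problems.append(f"{name}: EXTERNAL_IP_ACCESS_BLOCKED")
        if nic.get("alias_ip_ranges"):
            problems.append(f"{name}: ALIAS_IP_RANGE_BLOCKED")

    # Multi-NIC is allowed, but each NIC must attach to a unique subnet.
    subnet_counts = Counter(nic["subnet"] for nic in nics)
    for subnet, count in sorted(subnet_counts.items()):
        if count > 1:
            problems.append(f"subnet {subnet} is used by {count} NICs")
    return problems


planned = [
    {"name": "nic1", "subnet": "rdma-sub-1"},
    {"name": "nic2", "subnet": "rdma-sub-2"},
]
print(check_rdma_nics(planned))  # []
```

The sketch covers only properties that can be checked from a NIC list. Network-level features, such as VPC Network Peering or Cloud NAT, are blocked by the network profile itself.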

RoCE VPC network multi-NIC considerations

To support workloads that benefit from cross-rail GPU-to-GPU communication, RoCE VPC networks support VMs that have multiple MRDMA NICs in the network. Each MRDMA NIC must be in a unique subnet. Placing two or more MRDMA NICs in the same RoCE VPC network might affect network performance, including increased latency.

MRDMA NICs use the NVIDIA Collective Communications Library (NCCL). NCCL attempts to align all network transfers, even for cross-rail communication. For example, it uses PXN to copy data through NVLink to a rail-aligned GPU before transferring it over the network.
