RDMA RoCE network profile
This page provides an overview of the Remote Direct Memory Access (RDMA) over Converged Ethernet (RoCE) network profile in Google Cloud.
Overview
The RDMA RoCE network profile lets you create a Virtual Private Cloud (VPC) network that uses the RoCE v2 protocol to provide low-latency, high-bandwidth RDMA communication between the GPUs of VMs that are created in the network. A VPC network that uses the RoCE network profile is called an RoCE VPC network.
RoCE VPC networks are useful for running AI workloads. For more information about running AI workloads in Google Cloud, see AI Hypercomputer overview.
The resource name of an RoCE network profile has the following format: ZONE-vpc-roce (for example, europe-west1-b-vpc-roce).
To view specific network profile names, see List network profiles.
Supported zones
The RoCE network profile is available in the following zones:
europe-west1-b
us-central1-a
us-central1-b
us-east4-b
us-west1-c
You can only create an RoCE VPC network in a zone where an RoCE network profile is available.
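The naming rule and zone list above can be sketched in Python as follows. This is a minimal illustration rather than an official client: the function name `roce_profile_name` and the hard-coded `SUPPORTED_ZONES` set are assumptions made for this example, and the zone list simply mirrors this page, so it can go stale.

```python
# Zones copied from the "Supported zones" list on this page (may change over time).
SUPPORTED_ZONES = frozenset({
    "europe-west1-b",
    "us-central1-a",
    "us-central1-b",
    "us-east4-b",
    "us-west1-c",
})


def roce_profile_name(zone: str) -> str:
    """Return the RoCE network profile name (ZONE-vpc-roce) for a zone.

    Hypothetical helper: raises ValueError if the zone is not in the
    hard-coded list above.
    """
    if zone not in SUPPORTED_ZONES:
        raise ValueError(f"No RoCE network profile listed for zone {zone!r}")
    return f"{zone}-vpc-roce"


print(roce_profile_name("europe-west1-b"))  # europe-west1-b-vpc-roce
```

For an authoritative list of profile names, use the List network profiles page rather than a hard-coded set like this one.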
Specifications
RoCE VPC networks have the following specifications:
NVIDIA ConnectX NICs: NVIDIA ConnectX NICs appear as MRDMA network interfaces in Google Cloud.
Zonal constraint: resources that use an RoCE VPC network are limited to the zone of the RoCE network profile that was associated with the network when the network was created. This zonal limit has the following effects:
All instances that have network interfaces in an RoCE VPC network must be created in the zone that matches the zone of the RoCE network profile used by the RoCE VPC network.
All subnets created in an RoCE VPC network must be located in the region that contains the zone of the RoCE network profile used by the RoCE VPC network.
MRDMA network interfaces only: RoCE VPC networks only support MRDMA network interfaces (NICs), which are only available on the A3 Ultra, A4, and A4X machine series. All non-MRDMA NICs of a virtual machine (VM) must be attached to a regular VPC network.
8896-byte default MTU: the default maximum transmission unit (MTU) of an RoCE VPC network is 8896 bytes. This allows the RDMA driver in the VM's guest operating system to use smaller MTUs if needed. For best performance, we recommend that you not change the default MTU.
Firewall differences: RoCE VPC networks use different implied firewall rules. They only support regional network firewall policies that have an RoCE firewall policy type. The set of parameters for rules within a supported regional network firewall policy is limited. For more information, see Cloud NGFW for RoCE VPC networks.
No VPC Flow Logs support: RoCE VPC networks don't support VPC Flow Logs, even if you enable VPC Flow Logs for a subnet in an RoCE VPC network.
No Connectivity Tests support: Connectivity Tests doesn't support RoCE VPC networks.
Other VPC features: RoCE VPC networks support a limited set of other VPC features. For more information, see the following Supported and unsupported features section.
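As a rough illustration of the zonal constraint and MTU guidance above, the following Python sketch checks a planned deployment before any resources are created. The function names and parameters are hypothetical. The sketch relies on the fact that a Google Cloud zone name is its region name plus a one-letter suffix (for example, `us-central1-a` is in `us-central1`), and it assumes the network keeps the default MTU.

```python
DEFAULT_ROCE_MTU = 8896  # default MTU of an RoCE VPC network, in bytes


def region_of(zone: str) -> str:
    """Derive the region from a zone name, e.g. 'us-east4-b' -> 'us-east4'."""
    region, _, suffix = zone.rpartition("-")
    if not region or not suffix:
        raise ValueError(f"Not a zone name: {zone!r}")
    return region


def check_roce_placement(profile_zone, instance_zone, subnet_region,
                         guest_mtu=DEFAULT_ROCE_MTU):
    """Return a list of problems; an empty list means the plan is consistent."""
    problems = []
    # Instances must be created in the zone of the RoCE network profile.
    if instance_zone != profile_zone:
        problems.append(
            f"instance zone {instance_zone} != profile zone {profile_zone}")
    # Subnets must be in the region that contains the profile's zone.
    if subnet_region != region_of(profile_zone):
        problems.append(
            f"subnet region {subnet_region} != {region_of(profile_zone)}")
    # Assumption: the network uses the default MTU, so the guest may use a
    # smaller MTU but not a larger one.
    if guest_mtu > DEFAULT_ROCE_MTU:
        problems.append(
            f"guest MTU {guest_mtu} exceeds network MTU {DEFAULT_ROCE_MTU}")
    return problems


print(check_roce_placement("us-central1-a", "us-central1-a", "us-central1"))
```

Returning a list of problems instead of raising on the first one lets a caller report every mismatch in a single pass.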
Supported and unsupported features
The following table lists which VPC features are supported by RoCE VPC networks.
Feature | Supported | Network profile property | Network profile property value | Details |
---|---|---|---|---|
MRDMA NICs | Yes | interfaceTypes | MRDMA | RoCE VPC networks only support MRDMA NICs. |
Multi-NIC in the same network | Yes | allowMultiNicInSameNetwork | MULTI_NIC_IN_SAME_NETWORK_ALLOWED | RoCE VPC networks support multi-NIC VMs, allowing two or more MRDMA NICs of a VM to be in the same RoCE VPC network. |
IPv4-only subnets | Yes | subnetworkStackTypes | SUBNET_STACK_TYPE_IPV4_ONLY | RoCE VPC networks support IPv4-only subnets, including the same valid IPv4 ranges as regular VPC networks. RoCE VPC networks don't support dual-stack or IPv6-only subnets. For more information, see Types of subnets. |
PRIVATE subnet purpose | Yes | subnetworkPurposes | SUBNET_PURPOSE_PRIVATE | RoCE VPC networks support regular subnets, which have a PRIVATE purpose. RoCE VPC networks don't support Private Service Connect subnets, proxy-only subnets, or Private NAT subnets. For more information, see Purposes of subnets. |
GCE_ENDPOINT address purpose | Yes | addressPurposes | GCE_ENDPOINT | RoCE VPC networks support IP addresses with a GCE_ENDPOINT purpose. RoCE VPC networks don't support special purpose IP addresses. |
Attachments from nic0 | No | allowDefaultNicAttachment | DEFAULT_NIC_ATTACHMENT_BLOCKED | RoCE VPC networks don't support attaching the nic0 network interface of a VM to the network. Each MRDMA NIC attached to an RoCE VPC network must not be nic0. |
External IP addresses for VMs | No | allowExternalIpAccess | EXTERNAL_IP_ACCESS_BLOCKED | RoCE VPC networks don't support assigning external IP addresses to MRDMA VM NICs. Consequently, MRDMA VM NICs don't have internet access. |
Dynamic Network Interfaces | No | allowSubInterfaces | SUBINTERFACES_BLOCKED | RoCE VPC networks don't support Dynamic NICs. |
Alias IP ranges | No | allowAliasIpRanges | ALIAS_IP_RANGE_BLOCKED | RoCE VPC networks don't support assigning alias IP ranges to MRDMA NICs. |
IP forwarding | No | allowIpForwarding | IP_FORWARDING_BLOCKED | RoCE VPC networks don't support IP forwarding. |
VM network migration | No | allowNetworkMigration | NETWORK_MIGRATION_BLOCKED | RoCE VPC networks don't support migrating VM NICs between networks. |
Auto mode | No | allowAutoModeSubnet | AUTO_MODE_SUBNET_BLOCKED | RoCE VPC networks can't be auto mode networks. For more information, see subnet creation mode. |
VPC Network Peering | No | allowVpcPeering | VPC_PEERING_BLOCKED | RoCE VPC networks don't support connecting to other VPC networks using VPC Network Peering. Consequently, RoCE VPC networks don't support connecting to services using private services access. |
Static routes | No | allowStaticRoutes | STATIC_ROUTES_BLOCKED | RoCE VPC networks don't support static routes. |
Packet Mirroring | No | allowPacketMirroring | PACKET_MIRRORING_BLOCKED | RoCE VPC networks don't support Packet Mirroring. |
Cloud NAT | No | allowCloudNat | CLOUD_NAT_BLOCKED | RoCE VPC networks don't support Cloud NAT. |
Cloud Router | No | allowCloudRouter | CLOUD_ROUTER_BLOCKED | RoCE VPC networks don't support Cloud Routers or dynamic routes. |
Cloud Interconnect | No | allowInterconnect | INTERCONNECT_BLOCKED | RoCE VPC networks don't support Cloud Interconnect VLAN attachments. |
Cloud VPN | No | allowVpn | VPN_BLOCKED | RoCE VPC networks don't support Cloud VPN tunnels. |
Network Connectivity Center | No | allowNcc | NCC_BLOCKED | RoCE VPC networks don't support Network Connectivity Center. You can't add an RoCE VPC network as a VPC spoke to a Network Connectivity Center hub. |
Cloud Load Balancing | No | allowLoadBalancing | LOAD_BALANCING_BLOCKED | RoCE VPC networks don't support Cloud Load Balancing. Consequently, RoCE VPC networks don't support load balancer features, including Google Cloud Armor. |
Private Google Access | No | allowPrivateGoogleAccess | PRIVATE_GOOGLE_ACCESS_BLOCKED | RoCE VPC networks don't support Private Google Access. |
Private Service Connect | No | allowPsc | PSC_BLOCKED | RoCE VPC networks don't support Private Service Connect. |
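The profile properties in the table can be treated as a lookup when you validate a planned configuration. The sketch below copies a few property and value pairs from the table into a plain dictionary and checks a hypothetical NIC description against them. The `NicPlan` fields and the `violations` function are invented for this example and don't correspond to any Google Cloud API object. The `GVNIC` value is used only as an example of a non-MRDMA NIC type.

```python
from dataclasses import dataclass, field

# A subset of property/value pairs copied from the table above.
ROCE_PROFILE = {
    "interfaceTypes": "MRDMA",
    "subnetworkStackTypes": "SUBNET_STACK_TYPE_IPV4_ONLY",
    "allowDefaultNicAttachment": "DEFAULT_NIC_ATTACHMENT_BLOCKED",
    "allowExternalIpAccess": "EXTERNAL_IP_ACCESS_BLOCKED",
    "allowAliasIpRanges": "ALIAS_IP_RANGE_BLOCKED",
    "allowIpForwarding": "IP_FORWARDING_BLOCKED",
}


def is_blocked(prop: str) -> bool:
    """True if the profile value for this property ends in _BLOCKED."""
    return ROCE_PROFILE[prop].endswith("_BLOCKED")


@dataclass
class NicPlan:
    """Hypothetical description of one VM network interface."""
    name: str                      # for example "nic0" or "nic2"
    nic_type: str                  # for example "MRDMA"
    has_external_ip: bool = False
    alias_ip_ranges: list = field(default_factory=list)


def violations(nic: NicPlan) -> list:
    """Return the profile properties that this NIC plan would violate."""
    found = []
    if nic.nic_type != ROCE_PROFILE["interfaceTypes"]:
        found.append("interfaceTypes")
    if nic.name == "nic0" and is_blocked("allowDefaultNicAttachment"):
        found.append("allowDefaultNicAttachment")
    if nic.has_external_ip and is_blocked("allowExternalIpAccess"):
        found.append("allowExternalIpAccess")
    if nic.alias_ip_ranges and is_blocked("allowAliasIpRanges"):
        found.append("allowAliasIpRanges")
    return found


print(violations(NicPlan(name="nic0", nic_type="GVNIC", has_external_ip=True)))
# ['interfaceTypes', 'allowDefaultNicAttachment', 'allowExternalIpAccess']
```

Keeping the property names identical to the table makes it easy to trace a reported violation back to the corresponding row.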
RoCE VPC network multi-NIC considerations
To support workloads that benefit from cross-rail GPU-to-GPU communication, RoCE VPC networks support VMs that have multiple MRDMA NICs in the network. Each MRDMA NIC must be in a unique subnet. Placing two or more MRDMA NICs in the same RoCE VPC network might degrade network performance, for example by increasing latency. MRDMA NICs use NCCL. NCCL attempts to align all network transfers, even for cross-rail communication. For example, it uses PXN to copy data through NVLink to a rail-aligned GPU before transferring it over the network.
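The unique-subnet rule is easy to check before you create a VM. The following is a minimal sketch, assuming you keep a mapping from NIC name to subnet name. The helper name, NIC names, and subnet names are all hypothetical.

```python
from collections import Counter


def duplicate_subnets(nic_subnets: dict) -> list:
    """Return subnets that are assigned to more than one MRDMA NIC.

    nic_subnets maps a NIC name to the subnet it uses.
    """
    counts = Counter(nic_subnets.values())
    return sorted(subnet for subnet, n in counts.items() if n > 1)


plan = {
    "nic2": "roce-subnet-0",
    "nic3": "roce-subnet-1",
    "nic4": "roce-subnet-1",  # clashes with nic3
}
print(duplicate_subnets(plan))  # ['roce-subnet-1']
```

An empty result means every MRDMA NIC in the plan has its own subnet.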
What's next
- Network profiles for specific use cases
- Create a VPC network for RDMA NICs
- Cloud NGFW for RoCE VPC networks