About Hyperdisk ML


This document describes the features of Hyperdisk ML, which offers the highest throughput of all Google Cloud Hyperdisk types. Google recommends using Hyperdisk ML for machine learning and for workloads that require high read throughput on immutable datasets. The high throughput that Hyperdisk ML provides results in faster data load times, shorter accelerator idle times, and lower compute costs.

For large inference, training and HPC workloads, you can attach a single Hyperdisk ML volume to multiple compute instances in read-only mode.

You can specify up to 1,200,000 MiB/s of throughput for a single Hyperdisk ML volume. You can't provision an IOPS level, but each MiB/s of provisioned throughput comes with 16 IOPS, up to 19,200,000 IOPS.

For more information about Hyperdisk and the other Hyperdisk types, see About Hyperdisk.

To create a new Hyperdisk ML volume, see Create a Hyperdisk volume.

Use cases

Hyperdisk ML is a good fit for the following use cases:

  • HPC workloads
  • Machine learning
  • Accelerator-optimized workloads

Machine series support

You can use Hyperdisk ML with the following machine series:

About provisioned performance

You don't have to provision performance when you create Hyperdisk volumes. If you don't provision performance, Compute Engine creates the volume with default values that you can modify later. For details about default values, see Default IOPS and throughput values.

If you know your performance needs, you can specify a throughput limit for a Hyperdisk ML volume when you create the volume, and you can change the provisioned value after you create the volume. You can't specify a throughput level if you don't specify a size.

Size and performance limits

The following limits apply to the size, throughput, and IOPS values you can specify for a Hyperdisk ML volume.

  • Size: between 4 GiB and 64 TiB. The default size is 100 GiB.

  • Throughput: between 400 MiB/s and 1,200,000 MiB/s, with the following restrictions:

    • Minimum throughput: for volumes of 4-3,341 GiB, the minimum throughput is 400 MiB/s. For volumes of 3,342 GiB or larger, the minimum throughput ranges from 401-7,680 MiB/s.
    • Maximum throughput: for volumes of 750 GiB or larger, the maximum throughput is 1,200,000 MiB/s. For volumes of 4-749 GiB, the maximum throughput varies by size. For examples, see Limits for provisioned throughput.
  • IOPS: you can't specify an IOPS limit for Hyperdisk ML volumes. Instead, the provisioned IOPS depends on the provisioned throughput. Each Hyperdisk ML volume is provisioned with 16 IOPS for each MiB/s of throughput, up to a maximum of 19,200,000 IOPS.
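
As a rough illustration of the IOPS rule above, the following Python sketch derives the IOPS that accompany a given provisioned throughput. The function name `provisioned_iops` is hypothetical and isn't part of any Google Cloud API; the constants come from the limits listed in this section.

```python
# Constants taken from the limits listed in this section.
IOPS_PER_MIBPS = 16       # IOPS granted per MiB/s of provisioned throughput
MAX_IOPS = 19_200_000     # upper bound on IOPS for a single volume


def provisioned_iops(throughput_mibps: int) -> int:
    """Return the IOPS that come with a provisioned throughput (hypothetical helper)."""
    if throughput_mibps < 0:
        raise ValueError("throughput must be non-negative")
    return min(throughput_mibps * IOPS_PER_MIBPS, MAX_IOPS)


print(provisioned_iops(400))        # 6400
print(provisioned_iops(1_200_000))  # 19200000
```

Because 1,200,000 MiB/s multiplied by 16 equals 19,200,000, the IOPS cap is reached exactly at the maximum throughput.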

Limits for provisioned throughput

The following table lists the limits for provisioned throughput for common volume sizes. If a size isn't listed, use the following formula to calculate the allowable values, where x is the volume's size in GiB:

  • Minimum configurable throughput: MAX (400, 0.12x)
  • Maximum configurable throughput: MIN (1200000, 1600x)

Size (GiB)   Min throughput (MiB/s)   Max throughput (MiB/s)
4            400                      6,400
10           400                      16,000
50           400                      80,000
64           400                      102,400
100          400                      160,000
300          400                      480,000
500          400                      800,000
1,000        400                      1,200,000
5,000        600                      1,200,000
25,000       3,000                    1,200,000
64,000       7,680                    1,200,000
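
The formulas above can be checked with a short Python sketch. The helper names are hypothetical, and rounding 0.12x down to a whole MiB/s is an assumption, chosen because it reproduces the table values; the source doesn't state the rounding rule. The 64,000 GiB upper bound is taken from the largest size in the table.

```python
MIN_SIZE_GIB = 4
MAX_SIZE_GIB = 64_000          # largest size listed in the table (assumed upper bound)
MAX_THROUGHPUT_MIBPS = 1_200_000


def throughput_limits(size_gib: int) -> tuple[int, int]:
    """Return (minimum, maximum) configurable throughput in MiB/s for a size in GiB."""
    if not MIN_SIZE_GIB <= size_gib <= MAX_SIZE_GIB:
        raise ValueError(f"size must be {MIN_SIZE_GIB}-{MAX_SIZE_GIB} GiB")
    # MAX(400, 0.12x), computed with integers and rounded down (assumption).
    minimum = max(400, (12 * size_gib) // 100)
    # MIN(1200000, 1600x)
    maximum = min(MAX_THROUGHPUT_MIBPS, 1600 * size_gib)
    return minimum, maximum


def is_valid_throughput(size_gib: int, throughput_mibps: int) -> bool:
    """Check whether a throughput value falls inside the limits for a size."""
    minimum, maximum = throughput_limits(size_gib)
    return minimum <= throughput_mibps <= maximum


print(throughput_limits(500))            # (400, 800000)
print(throughput_limits(25_000))         # (3000, 1200000)
print(is_valid_throughput(10, 20_000))   # False: the maximum for 10 GiB is 16,000
```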

Default IOPS and throughput values

If you don't specify a size or a throughput limit when you create a Hyperdisk ML volume, Compute Engine assigns default values. The default throughput is based on the following formula, where x is the volume's size in GiB.

  • Default throughput: MAX (24x, 400) MiB/s
  • Default size: 100 GiB
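
The default formula can be sketched in Python as follows. The function name is hypothetical. The sketch applies the formula exactly as written and doesn't cap the result at the maximum configurable throughput, because the source doesn't say how the two interact.

```python
DEFAULT_SIZE_GIB = 100


def default_throughput(size_gib: int = DEFAULT_SIZE_GIB) -> int:
    """Return the default throughput in MiB/s: MAX(24x, 400), x = size in GiB."""
    if size_gib <= 0:
        raise ValueError("size must be positive")
    return max(24 * size_gib, 400)


print(default_throughput())     # 2400 for the default 100 GiB volume
print(default_throughput(4))    # 400, because 24 * 4 = 96 is below the 400 MiB/s floor
```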

Change the provisioned performance or size

You can change a volume's provisioned size at most once every 4 hours and its provisioned throughput at most once every 6 hours. For instructions on modifying size or performance, see Modify a Hyperdisk volume.
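
If you track changes in your own tooling, the earliest time for the next change can be estimated as in the following Python sketch. The helper name and the idea of recording the last change time yourself are assumptions for illustration; the sketch doesn't query Compute Engine.

```python
from datetime import datetime, timedelta, timezone

# Intervals stated in this section.
SIZE_CHANGE_INTERVAL = timedelta(hours=4)
THROUGHPUT_CHANGE_INTERVAL = timedelta(hours=6)


def next_allowed_change(last_change: datetime, interval: timedelta) -> datetime:
    """Return the earliest time another change is expected to be accepted."""
    return last_change + interval


last = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
print(next_allowed_change(last, SIZE_CHANGE_INTERVAL).isoformat())
# 2025-01-01T16:00:00+00:00
print(next_allowed_change(last, THROUGHPUT_CHANGE_INTERVAL).isoformat())
# 2025-01-01T18:00:00+00:00
```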

Performance limits when attached to an instance

This section lists the performance limits for Hyperdisk ML. You can specify up to 1,200,000 MiB/s of throughput for a single Hyperdisk ML volume. You can't provision an IOPS level, but each MiB/s of provisioned throughput comes with 16 IOPS, up to 19,200,000 IOPS.

The following table lists the maximum performance that Hyperdisk ML volumes can achieve for each supported instance. A Hyperdisk ML volume's performance when it's attached to an instance can't exceed the limits for the instance's machine type. The performance limits are also shared across all Hyperdisk ML volumes attached to the same instance, regardless of each volume's provisioned performance.

Scenarios that require multiple instances to reach provisioned performance

The provisioned throughput for a Hyperdisk ML volume is shared among all the instances that the volume is attached to, up to the maximum limit for the machine type that's listed in the following table. If a Hyperdisk ML volume's provisioned performance is higher than an instance's performance limit, the volume can achieve its provisioned performance only if it is attached to multiple instances. a3-ultragpu-8g instances have a throughput limit of 4,000 MiB/s.

For example, suppose you have a Hyperdisk ML volume provisioned with 500,000 MiB/s of throughput, and you want to attach the volume to a3-ultragpu-8g instances. A single a3-ultragpu-8g instance can't achieve more than 4,000 MiB/s of throughput. Therefore, to achieve the volume's provisioned throughput, you must attach the volume to at least 125 (500,000/4,000) a3-ultragpu-8g instances. On the other hand, for the a2-highgpu-1g machine type, which has a limit of 1,800 MiB/s, you would need 278 instances (500,000/1,800, rounded up).
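
The instance-count arithmetic in this example is a ceiling division, sketched below in Python. The function name is hypothetical; the per-instance limits are the throughput values from the table in this section.

```python
def instances_needed(provisioned_mibps: int, per_instance_limit_mibps: int) -> int:
    """Return the minimum number of instances needed to reach a volume's
    provisioned throughput, given one instance's throughput limit."""
    if provisioned_mibps <= 0 or per_instance_limit_mibps <= 0:
        raise ValueError("both values must be positive")
    # Integer ceiling division avoids floating-point rounding.
    return (provisioned_mibps + per_instance_limit_mibps - 1) // per_instance_limit_mibps


print(instances_needed(500_000, 4_000))  # 125 for a limit of 4,000 MiB/s
print(instances_needed(500_000, 1_800))  # 278 for a limit of 1,800 MiB/s
```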

Instance machine type           Maximum throughput (MiB/s)  Maximum IOPS
a2-*-1g                         1,800                       28,800
a2-*-2g                         2,400                       38,400
a2-*-4g                         2,400                       38,400
a2-*-8g                         2,400                       38,400
a2-megagpu-16g                  2,400                       38,400
a3-*-1g                         1,800                       28,800
a3-*-2g                         2,400                       38,400
a3-*-4g                         2,400                       38,400
a3-*-8g (in read-only mode)¹    4,000                       64,000
a3-*-8g (in read-write mode)¹   2,400                       38,400
c3-*-4                          400                         6,400
c3-*-8                          800                         12,800
c3-*-22                         1,800                       28,800
c3-*-44                         2,400                       38,400
c3-*-88                         2,400                       38,400
c3-*-176                        2,400                       38,400
c3-*-192                        2,400                       38,400
c3d-*-4                         400                         6,400
c3d-*-8                         800                         12,800
c3d-*-16                        1,200                       19,200
c3d-*-30                        1,200                       19,200
c3d-*-60                        2,400                       38,400
c3d-*-90                        2,400                       38,400
c3d-*-180                       2,400                       38,400
c3d-*-360                       2,400                       38,400
ct6e-standard-1t                1,200                       19,200
ct6e-standard-4t                1,800                       28,800
ct6e-standard-8t                1,800                       28,800
g2-standard-4                   800                         12,800
g2-standard-8                   1,200                       19,200
g2-standard-12                  1,800                       28,800
g2-standard-16                  2,400                       38,400
g2-standard-24                  2,400                       38,400
g2-standard-32                  2,400                       38,400
g2-standard-48                  2,400                       38,400
g2-standard-96                  2,400                       38,400

¹ For a3-*-8g instances, performance depends on whether the Hyperdisk ML volume is attached to the instance in read-only or read-write mode.

Regional availability for Hyperdisk ML

Hyperdisk ML is available in the following regions and zones:

Region Available Zones
Changhua County, Taiwan—asia-east1 asia-east1-a
asia-east1-b
asia-east1-c
Tokyo, Japan—asia-northeast1 asia-northeast1-a
asia-northeast1-b
asia-northeast1-c
Seoul, South Korea—asia-northeast3 asia-northeast3-a
asia-northeast3-b
Jurong West, Singapore—asia-southeast1 asia-southeast1-a
asia-southeast1-b
asia-southeast1-c
Mumbai, India—asia-south1 asia-south1-b
asia-south1-c
St. Ghislain, Belgium—europe-west1 europe-west1-b
europe-west1-c
London, England—europe-west2 europe-west2-a
europe-west2-b
Frankfurt, Germany—europe-west3 europe-west3-b
Eemshaven, Netherlands—europe-west4 europe-west4-a
europe-west4-b
europe-west4-c
Zurich, Switzerland—europe-west6 europe-west6-b
europe-west6-c
Tel Aviv, Israel—me-west1 me-west1-b
me-west1-c
Council Bluffs, Iowa—us-central1 us-central1-a
us-central1-b
us-central1-c
us-central1-f
Moncks Corner, South Carolina—us-east1 us-east1-b
us-east1-c
us-east1-d
Ashburn, Virginia—us-east4 us-east4-a
us-east4-b
us-east4-c
Columbus, Ohio—us-east5 us-east5-a
us-east5-b
us-east5-c
Dallas, Texas—us-south1 us-south1-a
The Dalles, Oregon—us-west1 us-west1-a
us-west1-b
us-west1-c
Salt Lake City, Utah—us-west3 us-west3-b
Las Vegas, Nevada—us-west4 us-west4-a
us-west4-b
us-west4-c

Disaster protection for Hyperdisk ML volumes

You can back up a Hyperdisk ML volume with standard snapshots. Snapshots back up the data on a Hyperdisk ML volume at a specific point in time.

Cross-zonal replication

You can't replicate Hyperdisk ML volumes to another zone. To replicate data to another zone within the same region, you must use Hyperdisk Balanced High Availability volumes.

Share a Hyperdisk ML volume between VMs

For accelerator-optimized machine learning workloads, you can attach the same Hyperdisk ML volume to multiple instances. This enables concurrent read-only access to a single volume from multiple VMs. This is more cost effective than having multiple disks with the same data.

There are no additional costs associated with sharing a disk between VMs. Attaching a disk in read-only mode to multiple VMs doesn't affect the disk's performance. Each VM can still reach the maximum disk performance possible for the VM's machine series.

Limitations for sharing Hyperdisk ML between VMs

  • Hyperdisk ML volumes don't support multi-writer mode; you can share a Hyperdisk ML volume among multiple instances only if the volume is in read-only mode.
  • Hyperdisk ML volumes can't be attached to a single instance in read-only mode.
  • If you share a Hyperdisk ML volume in read-only mode, you can't re-enable write access to the disk.
  • You can attach a Hyperdisk ML volume to up to 100 instances during every 30-second interval.
  • For Hyperdisk ML volumes, the maximum number of instances depends on the provisioned size, as follows:
    • Volumes less than 256 GiB in size: 2,500 VMs
    • Volumes with capacity of 256 GiB or more, and less than 1 TiB: 1,500 VMs
    • Volumes with capacity of 1 TiB or more, and less than 2 TiB: 600 VMs
    • Volumes with 2 TiB or more of capacity: 30 VMs

If the volume is attached to more than 20 VMs, then you must provision at least 100 MiB/s of throughput for each VM. For example, if you attach a disk to 500 VMs, you must provision the volume with at least 50,000 MiB/s of throughput.
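
The sharing limits above can be combined into a quick pre-flight check, sketched here in Python. The function names are hypothetical, and treating 1 TiB as 1,024 GiB is an assumption. A return value of 0 from `min_throughput_for_sharing` means that sharing adds no requirement beyond the volume's normal minimum throughput.

```python
GIB_PER_TIB = 1024  # assumption: 1 TiB = 1,024 GiB


def max_attached_vms(size_gib: int) -> int:
    """Return the maximum number of VMs that can share a volume of this size."""
    if size_gib < 256:
        return 2500
    if size_gib < 1 * GIB_PER_TIB:
        return 1500
    if size_gib < 2 * GIB_PER_TIB:
        return 600
    return 30


def min_throughput_for_sharing(vm_count: int) -> int:
    """Return the throughput (MiB/s) required by the per-VM rule,
    which applies only when more than 20 VMs are attached."""
    if vm_count > 20:
        return vm_count * 100
    return 0


print(max_attached_vms(512))             # 1500
print(min_throughput_for_sharing(500))   # 50000
print(min_throughput_for_sharing(20))    # 0
```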

To learn more, see Read-only mode for Hyperdisk.

Pricing

You're billed for the total provisioned throughput of your Hyperdisk ML volumes until you delete them. You incur charges even if the volume isn't attached to any instances or if the instance is suspended or stopped. For more information, see Disk pricing.

Limitations

  • Hyperdisk ML volumes are zonal and can only be accessed from the zone where you created the volume.
  • You can't create a machine image from a Hyperdisk volume.
  • You can't create an instant snapshot from a Hyperdisk ML volume.
  • You can't use Hyperdisk ML volumes as boot disks.
  • You can't create a Hyperdisk ML disk in read-write-single mode from a snapshot or a disk image. You must create the disk in read-only-many mode.
  • You can change a Hyperdisk ML volume's size every 4 hours, and its throughput every 6 hours.

What's next