About Hyperdisk ML

This document describes the features of Hyperdisk ML, which offers the highest throughput of all Google Cloud Hyperdisk types. Google recommends using Hyperdisk ML for machine learning and for workloads that require high read throughput on immutable datasets. The high throughput that Hyperdisk ML provides results in faster data load times, shorter accelerator idle times, and lower compute costs.

For large inference, training, and HPC workloads, you can attach a single Hyperdisk ML volume to multiple compute instances in read-only mode.

You can specify up to 1,200,000 MiB/s of throughput for a single Hyperdisk ML volume. You can't provision an IOPS level, but each MiB/s of provisioned throughput comes with 16 IOPS, up to 19,200,000 IOPS.

For more information about Hyperdisk and the other Hyperdisk types, see About Hyperdisk.

To create a Hyperdisk ML volume, see Create a Hyperdisk volume.

Use cases

Hyperdisk ML is a good fit for the following use cases:

HPC workloads
Machine learning
Accelerator-optimized workloads

Machine series support

You can use Hyperdisk ML with the following machine series:

About provisioned performance

You don't have to provision performance when you create Hyperdisk volumes. If you don't provision performance, Compute Engine creates the volume with default values that you can modify later. For details about default values, see Default IOPS and throughput values.

If you know your performance needs, you can specify a throughput limit for a Hyperdisk ML volume when you create the volume, and you can change the provisioned value after you create the volume. You can't specify a throughput level if you don't specify a size.

Size and performance limits

The following limits apply to the size, throughput, and IOPS values you can specify for a Hyperdisk ML volume.

Size: between 4 GiB and 64 TiB. The default size is 100 GiB.
Throughput: between 400 MiB/s and 1,200,000 MiB/s. Both the minimum and maximum throughput have their own limits based on the size of the volume, as follows:
- Minimum throughput: for volumes that are 4 to 3,341 GiB in size, the minimum value is 400 MiB/s. For volumes that are 3,342 GiB or greater in size, the minimum value depends on the size and ranges from 401 to 7,680 MiB/s.
- Maximum throughput: for volumes that are 750 GiB or greater in size, the maximum value is 1,200,000 MiB/s. For volumes that are 749 GiB or smaller in size, the maximum value depends on the size and ranges from 6,400 to 1,198,400 MiB/s.
For examples, see Limits for provisioned throughput.
IOPS: you can't specify an IOPS limit for Hyperdisk ML volumes. Instead, the provisioned IOPS depends on the provisioned throughput. Each Hyperdisk ML volume is provisioned with 16 IOPS for each MiB/s of throughput, up to a maximum of 19,200,000 IOPS.
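
As a quick illustration of the IOPS rule above, the following Python sketch derives the provisioned IOPS from a provisioned throughput value. The function and constant names are hypothetical and aren't part of any Google Cloud API; only the ratio (16 IOPS per MiB/s) and the cap (19,200,000 IOPS) come from this document.

```python
IOPS_PER_MIBPS = 16      # IOPS that come with each MiB/s of provisioned throughput
MAX_IOPS = 19_200_000    # upper bound on provisioned IOPS for one volume


def provisioned_iops(throughput_mibps: int) -> int:
    """Return the IOPS that accompany a provisioned throughput (hypothetical helper)."""
    if throughput_mibps < 0:
        raise ValueError("throughput must be non-negative")
    # IOPS scale linearly with throughput until they reach the cap.
    return min(throughput_mibps * IOPS_PER_MIBPS, MAX_IOPS)
```

For example, `provisioned_iops(400)` returns 6400, and `provisioned_iops(1_200_000)` returns 19200000.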

Limits for provisioned throughput

The following table lists the limits for provisioned throughput for common volume sizes. If a size isn't listed, use the following formula to calculate the allowable values, where x is the volume's size in GiB:

Minimum configurable throughput: MAX (400, 0.12x) MiB/s
Maximum configurable throughput: MIN (1,200,000, 1600x) MiB/s

Size (GiB)	Min throughput (MiB/s)	Max throughput (MiB/s)
4	400	6,400
10	400	16,000
50	400	80,000
64	400	102,400
100	400	160,000
300	400	480,000
500	400	800,000
1,000	400	1,200,000
5,000	600	1,200,000
25,000	3,000	1,200,000
64,000	7,680	1,200,000
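
For sizes that aren't in the table, the two formulas can be sketched in Python as follows. This is a minimal sketch rather than an official tool: the function name `throughput_limits` is hypothetical, and rounding the minimum down to a whole MiB/s is an assumption chosen because it reproduces the thresholds and table values in this document. The sketch doesn't validate that the size itself is within the allowed range.

```python
def throughput_limits(size_gib: int) -> tuple[int, int]:
    """Return (minimum, maximum) configurable throughput in MiB/s for a volume size.

    Hypothetical helper that applies MAX(400, 0.12x) and MIN(1,200,000, 1600x).
    """
    if size_gib <= 0:
        raise ValueError("size must be a positive number of GiB")
    # 0.12x computed with integer arithmetic to avoid floating-point drift;
    # rounding down is an assumption, not something the formula states.
    minimum = max(400, (12 * size_gib) // 100)
    maximum = min(1_200_000, 1600 * size_gib)
    return minimum, maximum
```

For example, `throughput_limits(5000)` returns `(600, 1200000)`, which matches the 5,000 GiB row in the table.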

Default IOPS and throughput values

If you don't specify a throughput limit or a size when you create a Hyperdisk ML volume, Compute Engine assigns default values. The defaults are as follows, where x is the volume's size in GiB.

Default throughput: MAX (24x, 400) MiB/s
Default size: 100 GiB
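
The default throughput formula can be expressed as a short Python sketch. The function name is hypothetical, and the sketch applies the formula exactly as written; it doesn't model any other limit that might apply to very large volumes.

```python
DEFAULT_SIZE_GIB = 100  # default volume size stated in this document


def default_throughput_mibps(size_gib: int = DEFAULT_SIZE_GIB) -> int:
    """Return the default throughput, MAX(24x, 400) MiB/s (hypothetical helper)."""
    if size_gib <= 0:
        raise ValueError("size must be a positive number of GiB")
    return max(24 * size_gib, 400)
```

With the default size of 100 GiB, `default_throughput_mibps()` returns 2400.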

Change the provisioned performance or size

You can change a Hyperdisk ML volume's provisioned size once every 4 hours and its provisioned throughput once every 6 hours. For instructions on modifying size or performance, see Modify a Hyperdisk volume.
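
If you script changes to a volume, a small helper like the following Python sketch can compute the earliest time that the next change is allowed. The names are hypothetical, and the sketch assumes that each interval is measured from the previous change to the same setting, which this document doesn't state explicitly.

```python
from datetime import datetime, timedelta

# Minimum spacing between changes, taken from the intervals stated above.
CHANGE_INTERVALS = {
    "size": timedelta(hours=4),
    "throughput": timedelta(hours=6),
}


def next_allowed_change(last_change: datetime, setting: str) -> datetime:
    """Return the earliest time the given setting can be changed again.

    Assumes the interval starts at the previous change to the same setting.
    """
    if setting not in CHANGE_INTERVALS:
        raise ValueError(f"unknown setting: {setting!r}")
    return last_change + CHANGE_INTERVALS[setting]
```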

Performance limits when attached to an instance

This section lists the performance limits for Hyperdisk ML. You can specify up to 1,200,000 MiB/s of throughput for a single Hyperdisk ML volume. You can't provision an IOPS level, but each MiB/s of provisioned throughput comes with 16 IOPS, up to 19,200,000 IOPS.

The following table lists the maximum performance that Hyperdisk ML volumes can achieve for each supported instance. A Hyperdisk ML volume's performance when it's attached to an instance can't exceed the limits for the instance's machine type. The performance limits are also shared across all Hyperdisk ML volumes attached to the same instance, regardless of each volume's provisioned performance.

Scenarios that require multiple instances to reach provisioned performance

The provisioned throughput for a Hyperdisk ML volume is shared among the instances that the volume is attached to, up to the maximum limit for the machine type that's listed in the following table. If a Hyperdisk ML volume's provisioned performance is higher than an instance's performance limit, the volume can achieve its provisioned performance only if it is attached to multiple instances. For example, a3-ultragpu-8g instances have a throughput limit of 4,000 MiB/s.

Suppose you have a Hyperdisk ML volume provisioned with 500,000 MiB/s of throughput, and you want to attach the volume to a3-ultragpu-8g instances. A single a3-ultragpu-8g instance can't achieve more than 4,000 MiB/s of throughput. Therefore, to achieve the volume's provisioned throughput, you must attach the volume to at least 125 (500,000/4,000) a3-ultragpu-8g instances. On the other hand, for the a2-highgpu-1g machine type, which has a limit of 1,800 MiB/s, you would need at least 278 instances (500,000/1,800, rounded up).

Instance machine type	Maximum IOPS	Maximum throughput (MiB/s)
`a2-*-1g`	28,800	1,800
`a2-*-2g`	38,400	2,400
`a2-*-4g`	38,400	2,400
`a2-*-8g`	38,400	2,400
`a2-megagpu-16g`	38,400	2,400
`a3-*-1g`	28,800	1,800
`a3-*-2g`	38,400	2,400
`a3-*-4g`	38,400	2,400
`a3-*-8g` (in read-only mode)¹	64,000	4,000
`a3-*-8g` (in read-write mode)¹	38,400	2,400
`c3-*-4`	6,400	400
`c3-*-8`	12,800	800
`c3-*-22`	28,800	1,800
`c3-*-44`	38,400	2,400
`c3-*-88`	38,400	2,400
`c3-*-176`	38,400	2,400
`c3-*-192`	38,400	2,400
`c3d-*-4`	6,400	400
`c3d-*-8`	12,800	800
`c3d-*-16`	19,200	1,200
`c3d-*-30`	19,200	1,200
`c3d-*-60`	38,400	2,400
`c3d-*-90`	38,400	2,400
`c3d-*-180`	38,400	2,400
`c3d-*-360`	38,400	2,400
`ct6e-standard-1t`	19,200	1,200
`ct6e-standard-4t`	28,800	1,800
`ct6e-standard-8t`	28,800	1,800
`g2-standard-4`	12,800	800
`g2-standard-8`	19,200	1,200
`g2-standard-12`	28,800	1,800
`g2-standard-16`	38,400	2,400
`g2-standard-24`	38,400	2,400
`g2-standard-32`	38,400	2,400
`g2-standard-48`	38,400	2,400
`g2-standard-96`	38,400	2,400

¹ For a3-*-8g instances, performance depends on whether the Hyperdisk ML volume is attached to the instance in read-only or read-write mode.
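
The multi-instance calculation described in this section can be sketched in Python. The function names are hypothetical, and the sketch makes two simplifying assumptions: every instance uses the same machine type, and no other Hyperdisk ML volume is attached to those instances (attached volumes share the per-instance limit).

```python
def per_instance_throughput(provisioned_mibps: int, instance_limit_mibps: int) -> int:
    """Highest throughput one instance can reach for the volume, in MiB/s."""
    return min(provisioned_mibps, instance_limit_mibps)


def instances_needed(provisioned_mibps: int, instance_limit_mibps: int) -> int:
    """Smallest number of identical instances needed to reach the provisioned throughput."""
    if provisioned_mibps <= 0 or instance_limit_mibps <= 0:
        raise ValueError("throughput values must be positive")
    # Ceiling division: a partially used instance still counts as one instance.
    return -(-provisioned_mibps // instance_limit_mibps)
```

For example, `instances_needed(500_000, 4_000)` returns 125 for the read-only `a3-*-8g` limit, and `instances_needed(500_000, 400)` returns 1250 for the `c3-*-4` limit.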

Regional availability for Hyperdisk ML

Hyperdisk ML is available in the following regions and zones:

Region	Available Zones
Changhua County, Taiwan—`asia-east1`	`asia-east1-a`
	`asia-east1-b`
	`asia-east1-c`
Tokyo, Japan—`asia-northeast1`	`asia-northeast1-a`
	`asia-northeast1-b`
	`asia-northeast1-c`
Seoul, South Korea—`asia-northeast3`	`asia-northeast3-a`
	`asia-northeast3-b`
Jurong West, Singapore—`asia-southeast1`	`asia-southeast1-a`
	`asia-southeast1-b`
	`asia-southeast1-c`
Mumbai, India—`asia-south1`	`asia-south1-b`
	`asia-south1-c`
St. Ghislain, Belgium—`europe-west1`	`europe-west1-b`
	`europe-west1-c`
London, England—`europe-west2`	`europe-west2-a`
	`europe-west2-b`
Frankfurt, Germany—`europe-west3`	`europe-west3-b`
Eemshaven, Netherlands—`europe-west4`	`europe-west4-a`
	`europe-west4-b`
	`europe-west4-c`
Zurich, Switzerland—`europe-west6`	`europe-west6-b`
	`europe-west6-c`
Tel Aviv, Israel—`me-west1`	`me-west1-b`
	`me-west1-c`
Council Bluffs, Iowa—`us-central1`	`us-central1-a`
	`us-central1-b`
	`us-central1-c`
	`us-central1-f`
Moncks Corner, South Carolina—`us-east1`	`us-east1-b`
	`us-east1-c`
	`us-east1-d`
Ashburn, Virginia—`us-east4`	`us-east4-a`
	`us-east4-b`
	`us-east4-c`
Columbus, Ohio—`us-east5`	`us-east5-a`
	`us-east5-b`
	`us-east5-c`
Dallas, Texas—`us-south1`	`us-south1-a`
The Dalles, Oregon—`us-west1`	`us-west1-a`
	`us-west1-b`
	`us-west1-c`
Salt Lake City, Utah—`us-west3`	`us-west3-b`
Las Vegas, Nevada—`us-west4`	`us-west4-a`
	`us-west4-b`
	`us-west4-c`

Disaster protection for Hyperdisk ML volumes

You can back up a Hyperdisk ML volume with standard snapshots. Snapshots back up the data on a Hyperdisk ML volume at a specific point in time.

Cross-zonal replication

You can't replicate Hyperdisk ML volumes to another zone. To replicate data to another zone within the same region, you must use Hyperdisk Balanced High Availability volumes.

Share a Hyperdisk ML volume between VMs

For accelerator-optimized machine learning workloads, you can attach the same Hyperdisk ML volume to multiple instances. This enables concurrent read-only access to a single volume from multiple VMs, which is more cost-effective than having multiple disks with the same data.

There are no additional costs associated with sharing a disk between VMs. Attaching a disk in read-only mode to multiple VMs doesn't affect the disk's performance. Each VM can still reach the maximum disk performance possible for the VM's machine series.

Limitations for sharing Hyperdisk ML between VMs

Hyperdisk ML volumes don't support multi-writer mode; you can share a Hyperdisk ML volume among multiple instances only if the volume is in read-only mode.
Hyperdisk ML volumes can't be attached to a single instance in read-only mode.
If you share a Hyperdisk ML volume in read-only mode, you can't re-enable write access to the disk.
You can attach a Hyperdisk ML volume to up to 100 instances during every 30-second interval.
For Hyperdisk ML volumes, the maximum number of instances depends on the volume's size:

Volumes less than 256 GiB in size: 2,500 VMs
Volumes with capacity of 256 GiB or more, and less than 1 TiB: 1,500 VMs
Volumes with capacity of 1 TiB or more, and less than 2 TiB: 600 VMs
Volumes with 2 TiB or more of capacity: 30 VMs

If the volume is attached to more than 20 VMs, then you must provision at least 100 MiB/s of throughput for each VM. For example, if you attach a disk to 500 VMs, you must provision the volume with at least 50,000 MiB/s of throughput.
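
The sharing limits above can be checked with a short Python sketch. The function names are hypothetical, and the sketch assumes binary units (1 TiB = 1,024 GiB). It covers only the per-size instance cap and the per-VM throughput rule, not the attachment rate limit or the general throughput limits for a volume.

```python
def max_attached_instances(size_gib: int) -> int:
    """Return the maximum number of VMs that can share a volume of this size."""
    if size_gib < 256:
        return 2500
    if size_gib < 1024:   # 256 GiB or more, and less than 1 TiB
        return 1500
    if size_gib < 2048:   # 1 TiB or more, and less than 2 TiB
        return 600
    return 30             # 2 TiB or more


def min_throughput_for_sharing(num_vms: int) -> int:
    """Return the throughput in MiB/s required by the per-VM sharing rule.

    Returns 0 when the rule doesn't apply (20 VMs or fewer); the general
    minimum throughput for the volume still applies in that case.
    """
    return 100 * num_vms if num_vms > 20 else 0
```

For example, `min_throughput_for_sharing(500)` returns 50000, and `max_attached_instances(512)` returns 1500.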

To learn more, see Read-only mode for Hyperdisk.

Pricing

You're billed for the total provisioned size and throughput of your Hyperdisk ML volumes until you delete them. You incur charges even if the volume isn't attached to any instances or if the instance is suspended or stopped. For more information, see Disk pricing.

Limitations

Hyperdisk ML volumes are zonal and can only be accessed from the zone where you created the volume.
You can't create a machine image from a Hyperdisk volume.
You can't create an instant snapshot from a Hyperdisk ML volume.
You can't use Hyperdisk ML volumes as boot disks.
You can't create a Hyperdisk ML disk in read-write-single mode from a snapshot or a disk image. You must create the disk in read-only-many mode.
You can change a Hyperdisk ML volume's size every 4 hours, and its throughput every 6 hours.

What's next

Add a Hyperdisk ML volume to your VM