Reserve capacity

This document explains how to reserve blocks of capacity by asking your account team to create a future reservation request for you. For other ways to get compute resources in AI Hypercomputer, see Choose a consumption option.

For a very high assurance that your workloads have the resources they need, request a future reservation from Google. This action lets you reserve blocks of capacity for a defined duration, starting on a specific date and time that you choose. Based on your request, Google creates a draft future reservation request. After you review and submit this draft request, and Google Cloud approves it, Compute Engine automatically creates (auto-creates) an empty reservation. Then, at your chosen start time, Compute Engine provisions your requested capacity into the auto-created reservation. You can then use the reservation to create virtual machine (VM) instances until the reservation period ends.

Limitations

This section describes the limitations for future reservation requests, and for the auto-created reservation for a request.

Limitations for future reservation requests

After Google creates a draft future reservation request for you, the following limitations apply:

  • You can't modify the request details, including the share type.

  • After the request is submitted, approved, and its state changes to PROVISIONING, you can't cancel or delete it. You commit to pay for the requested capacity from the request's start time, regardless of usage.

Limitations for auto-created reservations

After Compute Engine creates an on-demand reservation to fulfill your requested capacity, the following limitations apply:

Before you begin

Select the tab for how you plan to use the samples on this page:

Console

When you use the Google Cloud console to access Google Cloud services and APIs, you don't need to set up authentication.

gcloud

    In the Google Cloud console, activate Cloud Shell.

    At the bottom of the Google Cloud console, a Cloud Shell session starts and displays a command-line prompt. Cloud Shell is a shell environment with the Google Cloud CLI already installed and with values already set for your current project. It can take a few seconds for the session to initialize.

REST

To use the REST API samples on this page in a local development environment, you use the credentials you provide to the gcloud CLI.

    After installing the Google Cloud CLI, initialize it by running the following command:

    gcloud init

    If you're using an external identity provider (IdP), you must first sign in to the gcloud CLI with your federated identity.

For more information, see Authenticate for using REST in the Google Cloud authentication documentation.

Required roles

To get the permissions that you need to create a future reservation request, ask your administrator to grant you the Compute Future Reservation User (roles/compute.futureReservationUser) IAM role on the project. For more information about granting roles, see Manage access to projects, folders, and organizations.

This predefined role contains the permissions required to create a future reservation request. To see the exact permissions that are required, expand the Required permissions section:

Required permissions

The following permissions are required to create a future reservation request:

  • To allow Compute Engine to auto-create reservations: compute.reservations.create on the project
  • To create a future reservation request: compute.futureReservations.create on the project
  • To specify an instance template: compute.instanceTemplates.useReadOnly on the instance template

You might also be able to get these permissions with custom roles or other predefined roles.
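
As a quick pre-check, the permission list above can be compared against a list of permissions that you already know an account holds. The following is a minimal sketch, not an official tool: the function name `missing_permissions` and the `uses_instance_template` flag are hypothetical, and the sketch assumes that you obtained the granted permissions some other way, because it doesn't query IAM.

```python
# Permissions listed in the "Required permissions" section above.
PROJECT_PERMISSIONS = {
    "compute.reservations.create",
    "compute.futureReservations.create",
}
# Needed only when the request specifies an instance template.
TEMPLATE_PERMISSION = "compute.instanceTemplates.useReadOnly"


def missing_permissions(granted, uses_instance_template=False):
    """Return the required permissions that are absent from `granted`, sorted."""
    required = set(PROJECT_PERMISSIONS)
    if uses_instance_template:
        required.add(TEMPLATE_PERMISSION)
    return sorted(required - set(granted))


if __name__ == "__main__":
    granted = ["compute.futureReservations.create"]
    print(missing_permissions(granted, uses_instance_template=True))
```

An empty list means that every permission named on this page is present in the input.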

Quota

As part of the future reservation request process, Google manages quota for your reserved resources. You don't need to request quota. At the start time of your approved future reservation, Google increases your quota if you lack it for the reserved resources.

Overview

To reserve blocks of capacity, complete the following steps:

  1. Request capacity through your account team. Contact your account team to specify the type and number of resources that you want to reserve.

  2. Review and submit a draft reservation request. After Google creates a draft future reservation request, review it. If it looks correct, then submit the request for review. Google Cloud approves it within a few minutes.

Request capacity through your account team

Contact your account team and provide the following information for Google to create a draft future reservation request:

  • Project number: the number of the project where your account team creates the request and Compute Engine provisions the capacity.

  • Machine type: whether you want to reserve A4 (a4-highgpu-8g) or A3 Ultra (a3-ultragpu-8g) machine types.

  • Total count: the total number of VMs to reserve. You can only reserve multiples of two VMs. Block sizes and VM count per block vary based on machine type and availability. Your account team can provide more details for your request.

  • Zone: the zone where you want to reserve capacity. To review the available regions and zones for a GPU machine type, see GPU regions and zones.

  • Start time: the start time of the reservation period. You can start using the reserved capacity then. Format the start time as an RFC 3339 timestamp:

    YYYY-MM-DDTHH:MM:SSOFFSET
    

    Replace the following:

    • YYYY-MM-DD: a date formatted as a four-digit year, two-digit month, and a two-digit day of the month, separated by hyphens (-).

    • HH:MM:SS: a time formatted as a two-digit hour by using a 24-hour time, two-digit minutes, and two-digit seconds, separated by colons (:).

    • OFFSET: the time zone formatted as an offset of Coordinated Universal Time (UTC). For example, to use the Pacific Standard Time (PST), specify -08:00. To use no offset, specify Z.

  • End time: the end time of the reservation period. Format it as an RFC 3339 timestamp. At this time, Compute Engine does the following:

    • Compute Engine deletes the auto-created reservation.

    • Based on the termination action that you specified for the VMs, Compute Engine stops or deletes any VMs that use the reservation.

  • Share type: whether only your project can use the auto-created reservation (LOCAL), or other projects can use the reservation (SPECIFIC_PROJECTS). This property can't change after you submit the request. To share reserved capacity with other projects in your organization, do the following:

    1. If you haven't already, then verify that the project where Google creates the request is allowed to create shared reservations.

    2. Provide the numbers of the projects to share the reserved capacity with. You can specify up to 100 projects in your organization.

  • Reservation name: the name of the reservation that Compute Engine automatically creates to deliver your reserved capacity. Compute Engine only creates specifically targeted reservations.

  • Commitment name: if your reservation period is one year or longer, then you must purchase and attach a resource-based commitment to your reserved resources. You can purchase a commitment with a 1-year or 3-year plan. If you share the reserved capacity with other projects, then those projects get discounts only if they use the same Cloud Billing account as the project where you reserve capacity. For details, see Enable CUD sharing for resource-based commitments.

When Google creates the draft future reservation request, your account team contacts you.
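
Before you contact your account team, it can help to generate the timestamps and sanity-check the values in the preceding list. The following is a minimal sketch that uses only the Python standard library. The helper names `to_rfc3339` and `check_request` are hypothetical, and the checks cover only rules stated on this page (multiples of two VMs, an end time after the start time, the two share types, and at most 100 shared projects). The sketch doesn't check block sizes or availability, which your account team determines.

```python
from datetime import datetime, timedelta, timezone

MAX_SHARED_PROJECTS = 100  # sharing limit stated in the list above


def to_rfc3339(moment):
    """Format a timezone-aware datetime as YYYY-MM-DDTHH:MM:SSOFFSET."""
    if moment.tzinfo is None:
        raise ValueError("datetime must be timezone-aware")
    text = moment.replace(microsecond=0).isoformat()
    # isoformat() writes UTC as +00:00; this page uses Z for no offset.
    return (text[:-6] + "Z") if text.endswith("+00:00") else text


def check_request(total_count, start, end, share_type, shared_projects=()):
    """Return a list of problems found in the request details (empty if none)."""
    problems = []
    if total_count <= 0 or total_count % 2 != 0:
        problems.append("total count must be a positive multiple of two")
    if end <= start:
        problems.append("end time must be after start time")
    if share_type not in ("LOCAL", "SPECIFIC_PROJECTS"):
        problems.append("share type must be LOCAL or SPECIFIC_PROJECTS")
    elif share_type == "SPECIFIC_PROJECTS":
        if not 1 <= len(shared_projects) <= MAX_SHARED_PROJECTS:
            problems.append("provide between 1 and 100 project numbers")
    return problems


if __name__ == "__main__":
    pst = timezone(timedelta(hours=-8))  # Pacific Standard Time offset
    start = datetime(2026, 1, 15, 9, 0, 0, tzinfo=pst)
    end = datetime(2026, 3, 15, 9, 0, 0, tzinfo=timezone.utc)
    print(to_rfc3339(start))  # 2026-01-15T09:00:00-08:00
    print(to_rfc3339(end))    # 2026-03-15T09:00:00Z
    print(check_request(16, start, end, "LOCAL"))  # []
```

The dates and counts in the usage section are made-up example values, not recommendations.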

Review and submit a draft reservation request

After you tell your account team the type and number of resources to reserve, Google creates a draft future reservation request. You can review the draft request and, if it's correct, submit it for review. You must submit the request before the request start time.

To review and submit a draft future reservation request, select one of the following options:

Console

  1. In the Google Cloud console, go to the Reservations page.

    Go to Reservations

  2. Click the Future reservations tab. The Future Reservations table lists each future reservation request in your project, and each table column describes a property.

  3. In the Name column, click the name of the draft request that Google created for you. A page that gives the details of the future reservation request opens.

  4. In the Basic information section, verify that the request details, such as Dates and Share type, are correct. Also, if you requested a commitment, verify that it's specified. If any of these details are incorrect, then contact your account team.

  5. If everything looks accurate, click Submit. Google Cloud approves your request within a few minutes, and then Compute Engine creates an empty reservation with your requested resources.

gcloud

  1. To view a list of future reservation requests in your project, use the gcloud beta compute future-reservations list command with the --filter flag set to PROCUREMENT_STATUS=DRAFTING:

    gcloud beta compute future-reservations list --filter=PROCUREMENT_STATUS=DRAFTING
    
  2. In the command output, look for the reservation request that has the name that you provided to your account team.

  3. To view the details of the draft request, use the gcloud beta compute future-reservations describe command:

    gcloud beta compute future-reservations describe FUTURE_RESERVATION_NAME \
        --zone=ZONE
    

    Replace the following:

    • FUTURE_RESERVATION_NAME: the name of the draft future reservation request.

    • ZONE: the zone where Google created the request.

  4. In the command output, verify that the request details, such as the reservation period and share type, are correct. Additionally, if you requested a commitment, verify that it's specified. If the details are incorrect, then contact your account team.

  5. To submit the draft request for review, use the gcloud beta compute future-reservations update command with the --planning-status flag set to SUBMITTED:

    gcloud beta compute future-reservations update FUTURE_RESERVATION_NAME \
        --planning-status=SUBMITTED \
        --zone=ZONE
    

    Within a few minutes, Google Cloud approves your request, and then Compute Engine creates an empty reservation with your requested resources.
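
The review in step 4 can be partly automated if you run the describe command with the standard gcloud `--format=json` flag and save the output. The following is a minimal sketch under stated assumptions: the function names are hypothetical, and the JSON field names `timeWindow.startTime`, `timeWindow.endTime`, and `shareSettings.shareType` are assumptions about the resource representation that you should confirm against your own command output. Timestamps are compared as instants, so equivalent values written with different offsets still match.

```python
import json
from datetime import datetime


def parse_rfc3339(text):
    """Parse an RFC 3339 timestamp; Z is rewritten for older Python versions."""
    return datetime.fromisoformat(text.replace("Z", "+00:00"))


def review_draft(describe_json, expected_start, expected_end, expected_share_type):
    """Compare a draft request with expected values and return mismatches."""
    draft = json.loads(describe_json)
    window = draft.get("timeWindow", {})    # assumed field name
    share = draft.get("shareSettings", {})  # assumed field name
    problems = []
    checks = (
        ("start time", "startTime", expected_start),
        ("end time", "endTime", expected_end),
    )
    for label, key, expected in checks:
        found = window.get(key)
        if found is None or parse_rfc3339(found) != parse_rfc3339(expected):
            problems.append(f"{label}: expected {expected}, found {found}")
    found_share = share.get("shareType")
    if found_share != expected_share_type:
        problems.append(
            f"share type: expected {expected_share_type}, found {found_share}"
        )
    return problems


if __name__ == "__main__":
    # Made-up sample standing in for saved describe output.
    sample = json.dumps({
        "timeWindow": {
            "startTime": "2026-01-15T17:00:00Z",
            "endTime": "2026-03-15T09:00:00Z",
        },
        "shareSettings": {"shareType": "LOCAL"},
    })
    print(review_draft(sample, "2026-01-15T09:00:00-08:00",
                       "2026-03-15T09:00:00Z", "LOCAL"))  # []
```

If the returned list isn't empty, contact your account team as described in step 4 rather than submitting the request.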

REST

  1. To view a list of future reservation requests in your project, make a GET request to the beta futureReservations.list method. In the request URL, include the filter query parameter and set it to status.procurementStatus=DRAFTING:

    GET https://compute.googleapis.com/compute/beta/projects/PROJECT_ID/zones/ZONE/futureReservations?filter=status.procurementStatus=DRAFTING
    

    Replace the following:

    • PROJECT_ID: the ID of the project where Google created the draft future reservation request.

    • ZONE: the zone where the request exists.

  2. In the response, look for the reservation request that has the name that you provided to your account team.

  3. To view the details of the draft request, make a GET request to the beta futureReservations.get method:

    GET https://compute.googleapis.com/compute/beta/projects/PROJECT_ID/zones/ZONE/futureReservations/FUTURE_RESERVATION_NAME
    

    Replace FUTURE_RESERVATION_NAME with the name of the draft future reservation request.

  4. In the response, verify that the request details, such as the reservation period and share type, are correct. Additionally, if you requested a commitment, verify that it's specified. If the details are incorrect, then contact your account team.

  5. To submit the draft request for review, make a PATCH request to the beta futureReservations.update method. In the request URL, include the updateMask query parameter and set it to planningStatus:

    PATCH https://compute.googleapis.com/compute/beta/projects/PROJECT_ID/zones/ZONE/futureReservations/FUTURE_RESERVATION_NAME?updateMask=planningStatus
    
    {
      "name": "FUTURE_RESERVATION_NAME",
      "planningStatus": "SUBMITTED"
    }
    

    Within a few minutes, Google Cloud approves your request, and then Compute Engine creates an empty reservation with your requested resources.
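
The three REST calls above differ only in their URL and body, so they can be assembled from the same placeholders. The following is a minimal sketch that builds the URLs and the PATCH body with the Python standard library but doesn't send anything. The function names are hypothetical, and the example project, zone, and request name are made up. To actually send the requests, you would also need to attach credentials as described in the Before you begin section.

```python
import json
from urllib.parse import urlencode

BASE = "https://compute.googleapis.com/compute/beta"


def collection_url(project_id, zone):
    """URL of the futureReservations collection for a project and zone."""
    return f"{BASE}/projects/{project_id}/zones/{zone}/futureReservations"


def list_drafts_url(project_id, zone):
    """URL for step 1: list requests whose procurement status is DRAFTING."""
    # urlencode percent-encodes the "=" inside the filter value as %3D.
    query = urlencode({"filter": "status.procurementStatus=DRAFTING"})
    return f"{collection_url(project_id, zone)}?{query}"


def get_url(project_id, zone, name):
    """URL for step 3: view one draft request."""
    return f"{collection_url(project_id, zone)}/{name}"


def submit_request(project_id, zone, name):
    """Method, URL, and JSON body for step 5: submit the draft for review."""
    url = f"{get_url(project_id, zone, name)}?updateMask=planningStatus"
    body = json.dumps({"name": name, "planningStatus": "SUBMITTED"})
    return "PATCH", url, body


if __name__ == "__main__":
    print(list_drafts_url("example-project", "us-central1-a"))
    print(get_url("example-project", "us-central1-a", "example-request"))
    method, url, body = submit_request(
        "example-project", "us-central1-a", "example-request"
    )
    print(method, url)
    print(body)
```

Keeping the request construction separate from sending makes it easy to inspect exactly what will be submitted before you commit to the request.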

What's next