Class DeploymentResourcePool (1.75.0)

DeploymentResourcePool(
    deployment_resource_pool_name: str,
    project: typing.Optional[str] = None,
    location: typing.Optional[str] = None,
    credentials: typing.Optional[google.auth.credentials.Credentials] = None,
)

Retrieves a DeploymentResourcePool.

Parameters

Name Description
deployment_resource_pool_name str

Required. The fully-qualified resource name or ID of the deployment resource pool. Example: "projects/123/locations/us-central1/deploymentResourcePools/456" or "456" when project and location are initialized or passed.

project str

Optional. Project containing the deployment resource pool to retrieve. If not set, the project given to aiplatform.init will be used.

location str

Optional. Location containing the deployment resource pool to retrieve. If not set, the location given to aiplatform.init will be used.

credentials auth_credentials.Credentials

Optional. Custom credentials to use to retrieve this deployment resource pool. Overrides credentials set in aiplatform.init.
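
A minimal retrieval sketch, shown here via aiplatform.models; the project, location, and pool ID are placeholders:

from google.cloud import aiplatform

# Initialize the SDK so the short-form ID below can be resolved.
aiplatform.init(project="my-project", location="us-central1")

# Retrieve an existing pool by ID (or pass the fully qualified resource name).
pool = aiplatform.models.DeploymentResourcePool("456")
print(pool.resource_name)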

Properties

create_time

Time this resource was created.

display_name

Display name of this resource.

encryption_spec

Customer-managed encryption key options for this Vertex AI resource.

If this is set, then all resources created by this Vertex AI resource will be encrypted with the provided encryption key.

gca_resource

The underlying resource proto representation.

labels

User-defined labels containing metadata about this resource.

Read more about labels at https://goo.gl/xmQnxf

name

Name of this resource.

resource_name

Fully qualified resource name.

update_time

Time this resource was last updated.

Methods

create

create(
    deployment_resource_pool_id: str,
    project: typing.Optional[str] = None,
    location: typing.Optional[str] = None,
    metadata: typing.Sequence[typing.Tuple[str, str]] = (),
    credentials: typing.Optional[google.auth.credentials.Credentials] = None,
    machine_type: typing.Optional[str] = None,
    min_replica_count: int = 1,
    max_replica_count: int = 1,
    accelerator_type: typing.Optional[str] = None,
    accelerator_count: typing.Optional[int] = None,
    autoscaling_target_cpu_utilization: typing.Optional[int] = None,
    autoscaling_target_accelerator_duty_cycle: typing.Optional[int] = None,
    sync=True,
    create_request_timeout: typing.Optional[float] = None,
    reservation_affinity_type: typing.Optional[str] = None,
    reservation_affinity_key: typing.Optional[str] = None,
    reservation_affinity_values: typing.Optional[typing.List[str]] = None,
    spot: bool = False,
) -> google.cloud.aiplatform.models.DeploymentResourcePool

Creates a new DeploymentResourcePool.

Parameters
Name Description
deployment_resource_pool_id str

Required. User-specified name for the new deployment resource pool.

project str

Optional. Project in which to create the deployment resource pool. If not set, the project given to aiplatform.init will be used.

location str

Optional. Location in which to create the deployment resource pool. If not set, the location given to aiplatform.init will be used.

metadata Sequence[Tuple[str, str]]

Optional. Strings which should be sent along with the request as metadata.

machine_type str

Optional. Machine type to use for the deployment resource pool. If not set, the default machine type of n1-standard-2 is used.

min_replica_count int

Optional. The minimum replica count of the new deployment resource pool. Each replica serves a copy of each model deployed on the deployment resource pool. If this value is less than max_replica_count, then autoscaling is enabled, and the actual number of replicas will be adjusted to bring resource usage in line with the autoscaling targets.

max_replica_count int

Optional. The maximum replica count of the new deployment resource pool.

accelerator_type str

Optional. Hardware accelerator type. Must also set accelerator_count if used. One of NVIDIA_TESLA_K80, NVIDIA_TESLA_P100, NVIDIA_TESLA_V100, NVIDIA_TESLA_P4, NVIDIA_TESLA_T4, or NVIDIA_TESLA_A100.

accelerator_count int

Optional. The number of accelerators attached to each replica.

autoscaling_target_cpu_utilization int

Optional. Target CPU utilization value for autoscaling. A default value of 60 will be used if not specified.

autoscaling_target_accelerator_duty_cycle int

Optional. Target accelerator duty cycle percentage to use for autoscaling. Must also set accelerator_type and accelerator_count if specified. A default value of 60 will be used if accelerators are requested and this is not specified.

sync bool

Optional. Whether to execute this method synchronously. If False, this method will be executed in a concurrent Future and any downstream object will be immediately returned and synced when the Future has completed.

create_request_timeout float

Optional. The create request timeout in seconds.

reservation_affinity_type str

Optional. The type of reservation affinity. One of NO_RESERVATION, ANY_RESERVATION, SPECIFIC_RESERVATION, SPECIFIC_THEN_ANY_RESERVATION, SPECIFIC_THEN_NO_RESERVATION.

reservation_affinity_key str

Optional. Corresponds to the label key of a reservation resource. To target a SPECIFIC_RESERVATION by name, use compute.googleapis.com/reservation-name as the key and specify the name of your reservation as its value.

reservation_affinity_values List[str]

Optional. Corresponds to the label values of a reservation resource. This must be the full resource name of the reservation. Format: 'projects/{project_id_or_number}/zones/{zone}/reservations/{reservation_name}'

spot bool

Optional. Whether to schedule the deployment workload on spot VMs.
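
A hedged creation sketch (project, location, IDs, and sizing values are placeholders, not recommendations):

from google.cloud import aiplatform

aiplatform.init(project="my-project", location="us-central1")

# Create a pool that autoscales between 1 and 4 replicas on CPU utilization.
pool = aiplatform.models.DeploymentResourcePool.create(
    deployment_resource_pool_id="my-pool",
    machine_type="n1-standard-4",
    min_replica_count=1,
    max_replica_count=4,
    autoscaling_target_cpu_utilization=60,
)
print(pool.resource_name)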

delete

delete(sync: bool = True) -> None

Deletes this Vertex AI resource. WARNING: This deletion is permanent.
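
For illustration, assuming pool was retrieved as in the earlier sketch (this cannot be undone):

# Permanently delete the deployment resource pool.
pool.delete()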

list

list(
    filter: typing.Optional[str] = None,
    order_by: typing.Optional[str] = None,
    project: typing.Optional[str] = None,
    location: typing.Optional[str] = None,
    credentials: typing.Optional[google.auth.credentials.Credentials] = None,
) -> typing.List[google.cloud.aiplatform.models.DeploymentResourcePool]

Lists the deployment resource pools.

Parameters
Name Description
filter str

Optional. An expression for filtering the results of the request. For field names both snake_case and camelCase are supported.

order_by str

Optional. A comma-separated list of fields to order by, sorted in ascending order. Use "desc" after a field name for descending. Supported fields: display_name, create_time, update_time

project str

Optional. Project to retrieve list from. If not set, project set in aiplatform.init will be used.

location str

Optional. Location to retrieve list from. If not set, location set in aiplatform.init will be used.

credentials auth_credentials.Credentials

Optional. Custom credentials to use to retrieve list. Overrides credentials set in aiplatform.init.
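
A minimal listing sketch (project and location are placeholders):

from google.cloud import aiplatform

aiplatform.init(project="my-project", location="us-central1")

# List pools in the configured project and location, newest first.
pools = aiplatform.models.DeploymentResourcePool.list(order_by="create_time desc")
for pool in pools:
    print(pool.resource_name, pool.display_name)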

query_deployed_models

query_deployed_models(
    project: typing.Optional[str] = None,
    location: typing.Optional[str] = None,
    credentials: typing.Optional[google.auth.credentials.Credentials] = None,
) -> typing.List[google.cloud.aiplatform_v1.types.deployed_model_ref.DeployedModelRef]

Lists the deployed models using this resource pool.

Parameters
Name Description
project str

Optional. Project to retrieve list from. If not set, project set in aiplatform.init will be used.

location str

Optional. Location to retrieve list from. If not set, location set in aiplatform.init will be used.

credentials auth_credentials.Credentials

Optional. Custom credentials to use to retrieve list. Overrides credentials set in aiplatform.init.
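
A hedged sketch of inspecting which models share a pool (project, location, and pool ID are placeholders):

from google.cloud import aiplatform

aiplatform.init(project="my-project", location="us-central1")

pool = aiplatform.models.DeploymentResourcePool("456")
# Each entry is a DeployedModelRef naming an endpoint and a deployed model ID.
for ref in pool.query_deployed_models():
    print(ref.endpoint, ref.deployed_model_id)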

to_dict

to_dict() -> typing.Dict[str, typing.Any]

Returns the resource proto as a dictionary.
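
For example, assuming pool was retrieved as in the earlier sketch:

# Inspect the underlying proto as a plain dictionary.
pool_dict = pool.to_dict()
print(pool_dict["name"])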

wait

wait()

Helper method that blocks until all futures are complete.
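
A hedged sketch of pairing sync=False with wait(), using the same placeholder project, location, and IDs as above:

from google.cloud import aiplatform

aiplatform.init(project="my-project", location="us-central1")

# Start creation without blocking, then wait for the operation to finish.
pool = aiplatform.models.DeploymentResourcePool.create(
    deployment_resource_pool_id="my-pool",
    sync=False,
)
pool.wait()  # blocks until the underlying future completes
print(pool.resource_name)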