Class MatchingEngineIndexEndpoint (1.54.0)

MatchingEngineIndexEndpoint(
    index_endpoint_name: str,
    project: typing.Optional[str] = None,
    location: typing.Optional[str] = None,
    credentials: typing.Optional[google.auth.credentials.Credentials] = None,
)

Matching Engine index endpoint resource for Vertex AI.

Properties

create_time

Time this resource was created.

deployed_indexes

Returns a list of deployed indexes on this endpoint.

description

Description of the index endpoint.

display_name

Display name of this resource.

encryption_spec

Customer-managed encryption key options for this Vertex AI resource.

If this is set, then all resources created by this Vertex AI resource will be encrypted with the provided encryption key.

gca_resource

The underlying resource proto representation.

labels

User-defined labels containing metadata about this resource.

Read more about labels at https://goo.gl/xmQnxf

name

Name of this resource.

private_service_access_network

Private service access network.

private_service_connect_ip_address

Private service connect IP address.

public_endpoint_domain_name

Public endpoint DNS name.

resource_name

Fully qualified resource name.

update_time

Time this resource was last updated.

Methods

MatchingEngineIndexEndpoint

MatchingEngineIndexEndpoint(
    index_endpoint_name: str,
    project: typing.Optional[str] = None,
    location: typing.Optional[str] = None,
    credentials: typing.Optional[google.auth.credentials.Credentials] = None,
)

Retrieves an existing index endpoint given a name or ID.

Example Usage:

my_index_endpoint = aiplatform.MatchingEngineIndexEndpoint(
    index_endpoint_name='projects/123/locations/us-central1/indexEndpoints/my_index_endpoint_id'
)
or
my_index_endpoint = aiplatform.MatchingEngineIndexEndpoint(
    index_endpoint_name='my_index_endpoint_id'
)

Parameters
Name	Description
`index_endpoint_name`	`str` Required. A fully qualified index endpoint resource name or an index endpoint ID. Example: "projects/123/locations/us-central1/indexEndpoints/my_index_endpoint_id" or "my_index_endpoint_id" when project and location are initialized or passed.
`project`	`str` Optional. Project to retrieve index endpoint from. If not set, project set in aiplatform.init will be used.
`location`	`str` Optional. Location to retrieve index endpoint from. If not set, location set in aiplatform.init will be used.
`credentials`	`auth_credentials.Credentials` Optional. Custom credentials to use to retrieve this IndexEndpoint. Overrides credentials set in aiplatform.init.
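The two accepted forms of `index_endpoint_name` can be related with a small helper. This is a minimal sketch, not part of the SDK: `index_endpoint_resource_name` and `load_endpoint` are hypothetical names, and the `indexEndpoints` path segment is assumed from the Vertex AI resource naming scheme.

```python
def index_endpoint_resource_name(project: str, location: str, endpoint_id: str) -> str:
    """Build a fully qualified index endpoint resource name (hypothetical helper)."""
    if "/" in endpoint_id:
        # Already looks like a full resource name; return it unchanged.
        return endpoint_id
    return f"projects/{project}/locations/{location}/indexEndpoints/{endpoint_id}"


def load_endpoint(endpoint_id: str, project: str, location: str):
    """Retrieve an existing endpoint by ID; requires google-cloud-aiplatform."""
    # Imported lazily so the helper above stays dependency-free.
    from google.cloud import aiplatform

    return aiplatform.MatchingEngineIndexEndpoint(
        index_endpoint_name=endpoint_id,
        project=project,
        location=location,
    )
```

Passing `project` and `location` explicitly avoids relying on values set earlier through `aiplatform.init`.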

create

create(
    display_name: str,
    network: typing.Optional[str] = None,
    public_endpoint_enabled: bool = False,
    description: typing.Optional[str] = None,
    labels: typing.Optional[typing.Dict[str, str]] = None,
    project: typing.Optional[str] = None,
    location: typing.Optional[str] = None,
    credentials: typing.Optional[google.auth.credentials.Credentials] = None,
    request_metadata: typing.Optional[typing.Sequence[typing.Tuple[str, str]]] = (),
    sync: bool = True,
    enable_private_service_connect: bool = False,
    project_allowlist: typing.Optional[typing.Sequence[str]] = None,
    encryption_spec_key_name: typing.Optional[str] = None,
    create_request_timeout: typing.Optional[float] = None,
) -> (
    google.cloud.aiplatform.matching_engine.matching_engine_index_endpoint.MatchingEngineIndexEndpoint
)

Creates a MatchingEngineIndexEndpoint resource.

Example Usage:

my_index_endpoint = aiplatform.MatchingEngineIndexEndpoint.create(
    display_name='my_endpoint',
)

Parameters
Name	Description
`display_name`	`str` Required. The display name of the IndexEndpoint. The name can be up to 128 characters long and can consist of any UTF-8 characters.
`network`	`str` Optional. The full name of the Google Compute Engine network (https://cloud.google.com/compute/docs/networks-and-firewalls#networks) to which the IndexEndpoint should be peered. Private services access must already be configured for the network. If left unspecified, the network set with aiplatform.init will be used. Format (https://cloud.google.com/compute/docs/reference/rest/v1/networks/insert): projects/{project}/global/networks/{network}, where {project} is a project number, as in '12345', and {network} is a network name.
`public_endpoint_enabled`	`bool` Optional. If true, the deployed index will be accessible through public endpoint.
`description`	`str` Optional. The description of the IndexEndpoint.
`labels`	`Dict[str, str]` Optional. The labels with user-defined metadata to organize your IndexEndpoint. Label keys and values can be no longer than 64 characters (Unicode codepoints), can only contain lowercase letters, numeric characters, underscores and dashes. International characters are allowed. See https://goo.gl/xmQnxf for more information on and examples of labels. No more than 64 user labels can be associated with one IndexEndpoint (System labels are excluded). System reserved label keys are prefixed with "aiplatform.googleapis.com/" and are immutable.
`project`	`str` Optional. Project to create IndexEndpoint in. If not set, project set in aiplatform.init will be used.
`location`	`str` Optional. Location to create IndexEndpoint in. If not set, location set in aiplatform.init will be used.
`credentials`	`auth_credentials.Credentials` Optional. Custom credentials to use to create IndexEndpoints. Overrides credentials set in aiplatform.init.
`request_metadata`	`Sequence[Tuple[str, str]]` Optional. Strings which should be sent along with the request as metadata.
`sync`	`bool` Optional. Whether to execute this creation synchronously. If False, this method will be executed in concurrent Future and any downstream object will be immediately returned and synced when the Future has completed.
`enable_private_service_connect`	`bool` If true, expose the index endpoint via private service connect.
`project_allowlist`	`Sequence[str]` Optional. List of projects from which the forwarding rule will target the service attachment.
`encryption_spec_key_name`	`str` Optional. The Cloud KMS resource identifier of the customer managed encryption key used to protect the index endpoint. Has the form: `projects/my-project/locations/my-region/keyRings/my-kr/cryptoKeys/my-key`. If set, this index endpoint and all sub-resources of this index endpoint will be secured by this key. The key needs to be in the same region as where the index endpoint is created.
`create_request_timeout`	`float` Optional. The timeout for the request in seconds.

Exceptions
Type	Description
`ValueError`	A network must be instantiated when creating an IndexEndpoint.
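The label constraints listed for `labels` can be checked on the client before calling `create`. The following is a rough sketch under stated assumptions: `label_problems` is a hypothetical helper, and its reading of "lowercase letters" (any alphabetic character that is not uppercase, so caseless scripts pass) is an interpretation of the rule above rather than the service's actual validator.

```python
from typing import Dict, List

MAX_USER_LABELS = 64   # limit on user labels per IndexEndpoint
MAX_LENGTH = 64        # limit on key/value length, in Unicode code points
SYSTEM_PREFIX = "aiplatform.googleapis.com/"


def _allowed_char(ch: str) -> bool:
    # Digits, underscores, dashes, and letters that are not uppercase.
    if ch in "_-" or ch.isdecimal():
        return True
    return ch.isalpha() and not ch.isupper()


def label_problems(labels: Dict[str, str]) -> List[str]:
    """Return human-readable problems; an empty list means the labels look fine."""
    problems = []
    # System labels are excluded from the checks.
    user_labels = {k: v for k, v in labels.items() if not k.startswith(SYSTEM_PREFIX)}
    if len(user_labels) > MAX_USER_LABELS:
        problems.append(f"too many user labels: {len(user_labels)} > {MAX_USER_LABELS}")
    for key, value in user_labels.items():
        for kind, text in (("key", key), ("value", value)):
            if len(text) > MAX_LENGTH:
                problems.append(f"{kind} {text!r} is longer than {MAX_LENGTH} characters")
            if not all(_allowed_char(ch) for ch in text):
                problems.append(f"{kind} {text!r} contains a disallowed character")
    return problems
```

A caller might run `label_problems(my_labels)` and only pass `labels=my_labels` to `create` when the returned list is empty.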

delete

delete(force: bool = False, sync: bool = True) -> None

Deletes this MatchingEngineIndexEndpoint resource. If force is set to True, all indexes on this endpoint will be undeployed prior to deletion.

Parameters
Name	Description
`force`	`bool` Optional. If force is set to True, all deployed indexes on this endpoint will be undeployed first. Default is False.
`sync`	`bool` Whether to execute this method synchronously. If False, this method will be executed in concurrent Future and any downstream object will be immediately returned and synced when the Future has completed.

Exceptions
Type	Description
`FailedPrecondition`	If indexes are deployed on this MatchingEngineIndexEndpoint and force = False.
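Because `delete` fails with `FailedPrecondition` when indexes are still deployed and `force` is False, a caller may want to make the choice explicit. A minimal sketch, assuming `endpoint` is a `MatchingEngineIndexEndpoint` instance; `delete_endpoint` and `allow_undeploy` are hypothetical names.

```python
def delete_endpoint(endpoint, allow_undeploy: bool = False) -> bool:
    """Delete `endpoint`, undeploying its indexes only when explicitly allowed.

    Returns True if delete() was called, False if deletion was skipped.
    """
    deployed = list(endpoint.deployed_indexes)
    if deployed and not allow_undeploy:
        # delete(force=False) would raise FailedPrecondition here, so skip it.
        return False
    # force=True is only needed when something is still deployed.
    endpoint.delete(force=bool(deployed))
    return True
```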

deploy_index

deploy_index(
    index: google.cloud.aiplatform.matching_engine.matching_engine_index.MatchingEngineIndex,
    deployed_index_id: str,
    display_name: typing.Optional[str] = None,
    machine_type: typing.Optional[str] = None,
    min_replica_count: typing.Optional[int] = None,
    max_replica_count: typing.Optional[int] = None,
    enable_access_logging: typing.Optional[bool] = None,
    reserved_ip_ranges: typing.Optional[typing.Sequence[str]] = None,
    deployment_group: typing.Optional[str] = None,
    auth_config_audiences: typing.Optional[typing.Sequence[str]] = None,
    auth_config_allowed_issuers: typing.Optional[typing.Sequence[str]] = None,
    request_metadata: typing.Optional[typing.Sequence[typing.Tuple[str, str]]] = (),
    deploy_request_timeout: typing.Optional[float] = None,
) -> (
    google.cloud.aiplatform.matching_engine.matching_engine_index_endpoint.MatchingEngineIndexEndpoint
)

Deploys an existing index resource to this endpoint resource.

Parameters
Name	Description
`index`	`MatchingEngineIndex` Required. The Index this is the deployment of. We may refer to this Index as the DeployedIndex's "original" Index.
`deployed_index_id`	`str` Required. The user specified ID of the DeployedIndex. The ID can be up to 128 characters long and must start with a letter and only contain letters, numbers, and underscores. The ID must be unique within the project it is created in.
`display_name`	`str` The display name of the DeployedIndex. If not provided upon creation, the Index's display_name is used.
`machine_type`	`str` Optional. The type of machine. Not specifying a machine type results in deployment with automatic resources.
`min_replica_count`	`int` Optional. The minimum number of machine replicas this deployed model will be always deployed on. If traffic against it increases, it may dynamically be deployed onto more replicas, and as traffic decreases, some of these extra replicas may be freed. If this value is not provided, the value of 2 will be used.
`max_replica_count`	`int` Optional. The maximum number of replicas this deployed model may be deployed on when the traffic against it increases. If requested value is too large, the deployment will error, but if deployment succeeds then the ability to scale the model to that many replicas is guaranteed (barring service outages). If traffic against the deployed model increases beyond what its replicas at maximum may handle, a portion of the traffic will be dropped. If this value is not provided, the larger value of min_replica_count or 2 will be used. If value provided is smaller than min_replica_count, it will automatically be increased to be min_replica_count.
`enable_access_logging`	`bool` Optional. If true, private endpoint's access logs are sent to StackDriver Logging. These logs are like standard server access logs, containing information like timestamp and latency for each MatchRequest. Note that Stackdriver logs may incur a cost, especially if the deployed index receives a high queries per second rate (QPS). Estimate your costs before enabling this option.
`reserved_ip_ranges`	`Sequence[str]` Optional. A list of reserved IP ranges under the VPC network that can be used for this DeployedIndex. If set, we will deploy the index within the provided IP ranges. Otherwise, the index might be deployed to any IP ranges under the provided VPC network. The value should be the name of the address (https://cloud.google.com/compute/docs/reference/rest/v1/addresses). Example: 'vertex-ai-ip-range'.
`deployment_group`	`str` Optional. The deployment group can be no longer than 64 characters (e.g. 'test', 'prod'). If not set, we will use the 'default' deployment group. Creating `deployment_groups` with `reserved_ip_ranges` is a recommended practice when the peered network has multiple peering ranges. This creates your deployments from predictable IP spaces for easier traffic administration. Also, one deployment_group (except 'default') can only be used with the same reserved_ip_ranges, which means if the deployment_group has been used with reserved_ip_ranges: [a, b, c], using it with [a, b] or [d, e] is disallowed. Note: we only support up to 5 deployment groups (not including 'default').
`auth_config_audiences`	`Sequence[str]` The list of JWT audiences (https://tools.ietf.org/html/draft-ietf-oauth-json-web-token-32#section-4.1.3) that are allowed to access. A JWT containing any of these audiences will be accepted. auth_config_audiences and auth_config_allowed_issuers must be passed together.
`auth_config_allowed_issuers`	`Sequence[str]` A list of allowed JWT issuers. Each entry must be a valid Google service account, in the following format: `service-account-name@project-id.iam.gserviceaccount.com`. auth_config_audiences and auth_config_allowed_issuers must be passed together.
`request_metadata`	`Sequence[Tuple[str, str]]` Optional. Strings which should be sent along with the request as metadata.
`deploy_request_timeout`	`float` Optional. The timeout for the request in seconds.
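The rules stated for `deployed_index_id`, `min_replica_count` and `max_replica_count` can be mirrored locally to catch mistakes before a deploy request is sent. This is an illustrative sketch only: both helpers are hypothetical, "letters" is assumed to mean ASCII letters, and the service remains the authority on what it accepts.

```python
import re
from typing import Optional, Tuple

# Starts with a letter, then letters, digits or underscores; 128 characters at most.
_DEPLOYED_INDEX_ID = re.compile(r"[A-Za-z][A-Za-z0-9_]{0,127}")


def is_valid_deployed_index_id(deployed_index_id: str) -> bool:
    return _DEPLOYED_INDEX_ID.fullmatch(deployed_index_id) is not None


def resolve_replica_counts(
    min_replica_count: Optional[int] = None,
    max_replica_count: Optional[int] = None,
    default: int = 2,
) -> Tuple[int, int]:
    """Apply the documented defaults: min falls back to `default`, max to
    the larger of min and `default`, and max is raised to min if smaller."""
    min_count = default if min_replica_count is None else min_replica_count
    if max_replica_count is None:
        max_count = max(min_count, default)
    else:
        max_count = max(max_replica_count, min_count)
    return min_count, max_count
```

For example, `resolve_replica_counts(4, 2)` yields `(4, 4)` because a maximum below the minimum is raised to the minimum.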

find_neighbors

find_neighbors(
    *,
    deployed_index_id: str,
    queries: typing.Optional[
        typing.Union[
            typing.List[typing.List[float]],
            typing.List[
                google.cloud.aiplatform.matching_engine.matching_engine_index_endpoint.HybridQuery
            ],
        ]
    ] = None,
    num_neighbors: int = 10,
    filter: typing.Optional[
        typing.List[
            google.cloud.aiplatform.matching_engine.matching_engine_index_endpoint.Namespace
        ]
    ] = None,
    per_crowding_attribute_neighbor_count: typing.Optional[int] = None,
    approx_num_neighbors: typing.Optional[int] = None,
    fraction_leaf_nodes_to_search_override: typing.Optional[float] = None,
    return_full_datapoint: bool = False,
    numeric_filter: typing.Optional[
        typing.List[
            google.cloud.aiplatform.matching_engine.matching_engine_index_endpoint.NumericNamespace
        ]
    ] = None,
    embedding_ids: typing.Optional[typing.List[str]] = None
) -> typing.List[
    typing.List[
        google.cloud.aiplatform.matching_engine.matching_engine_index_endpoint.MatchNeighbor
    ]
]

Retrieves nearest neighbors for the given embedding queries on the specified deployed index which is deployed to either public or private endpoint.

Example usage:
    my_index_endpoint = aiplatform.MatchingEngineIndexEndpoint(
        index_endpoint_name='projects/123/locations/us-central1/indexEndpoints/my_index_endpoint_id'
    )
    my_index_endpoint.find_neighbors(deployed_index_id="deployed_index_id", queries=[[1, 1]])

Parameters
Name	Description
`deployed_index_id`	`str` Required. The ID of the DeployedIndex to match the queries against.
`queries`	`Union[List[List[float]], List[HybridQuery]]` Optional. A list of queries. For regular dense-only queries, each query is a list of floats, representing a single embedding. For hybrid queries, each query is a hybrid query of type aiplatform.matching_engine.matching_engine_index_endpoint.HybridQuery.
`num_neighbors`	`int` Optional. The number of nearest neighbors to be retrieved from the database for each query. Defaults to 10.
`filter`	`List[Namespace]` Optional. A list of Namespaces for filtering the matching results. For example, [Namespace("color", ["red"], []), Namespace("shape", [], ["squared"])] will match datapoints that satisfy "red color" but not include datapoints with "squared shape". Please refer to https://cloud.google.com/vertex-ai/docs/matching-engine/filtering#json for more detail.
`per_crowding_attribute_neighbor_count`	`int` Optional. Crowding is a constraint on a neighbor list produced by nearest neighbor search requiring that no more than some value k' of the k neighbors returned have the same value of crowding_attribute. It's used for improving result diversity. This field is the maximum number of matches with the same crowding tag.
`approx_num_neighbors`	`int` Optional. The number of neighbors to find via approximate search before exact reordering is performed. If not set, the default value from scam config is used; if set, this value must be > 0.
`fraction_leaf_nodes_to_search_override`	`float` Optional. The fraction of the number of leaves to search. Setting it at query time allows the user to tune search performance. Increasing this value increases both search accuracy and latency. The value should be between 0.0 and 1.0.
`return_full_datapoint`	`bool` Optional. If set to true, the full datapoints (including all vector values and restricts) of the nearest neighbors are returned. Note that returning full datapoints will significantly increase the latency and cost of the query.
`numeric_filter`	`List[NumericNamespace]` Optional. A list of NumericNamespaces for filtering the matching results. For example: [NumericNamespace(name="cost", value_int=5, op="GREATER")] will match datapoints whose cost is greater than 5.
`embedding_ids`	`List[str]` Optional. If `queries` is set, `queries` is used for the nearest neighbor search. If `queries` isn't set, `embedding_ids` is first used to look up embedding values in the dataset; if embeddings with those IDs exist in the dataset, the nearest neighbor search is performed with them.
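Since `find_neighbors` returns one list of `MatchNeighbor` objects per query, a thin wrapper can reduce the result to datapoint IDs. A minimal sketch, assuming `endpoint` is a `MatchingEngineIndexEndpoint` and that each `MatchNeighbor` exposes an `id` attribute; `nearest_ids` is a hypothetical helper name.

```python
from typing import List, Sequence


def nearest_ids(
    endpoint,
    deployed_index_id: str,
    queries: Sequence[Sequence[float]],
    num_neighbors: int = 10,
) -> List[List[str]]:
    """Return, for each dense query, the IDs of its nearest neighbors."""
    results = endpoint.find_neighbors(
        deployed_index_id=deployed_index_id,
        queries=[list(q) for q in queries],
        num_neighbors=num_neighbors,
    )
    # One inner list per query, in the same order as `queries`.
    return [[neighbor.id for neighbor in per_query] for per_query in results]
```

All arguments are passed by keyword because the signature above is keyword-only.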

list

list(
    filter: typing.Optional[str] = None,
    order_by: typing.Optional[str] = None,
    project: typing.Optional[str] = None,
    location: typing.Optional[str] = None,
    credentials: typing.Optional[google.auth.credentials.Credentials] = None,
    parent: typing.Optional[str] = None,
) -> typing.List[google.cloud.aiplatform.base.VertexAiResourceNoun]

List all instances of this Vertex AI Resource.

Example Usage:

aiplatform.BatchPredictionJob.list(
    filter='state="JOB_STATE_SUCCEEDED" AND display_name="my_job"',
)

aiplatform.Model.list(order_by="create_time desc, display_name")

Parameters
Name	Description
`filter`	`str` Optional. An expression for filtering the results of the request. For field names both snake_case and camelCase are supported.
`order_by`	`str` Optional. A comma-separated list of fields to order by, sorted in ascending order. Use "desc" after a field name for descending. Supported fields: `display_name`, `create_time`, `update_time`
`project`	`str` Optional. Project to retrieve list from. If not set, project set in aiplatform.init will be used.
`location`	`str` Optional. Location to retrieve list from. If not set, location set in aiplatform.init will be used.
`credentials`	`auth_credentials.Credentials` Optional. Custom credentials to use to retrieve list. Overrides credentials set in aiplatform.init.
`parent`	`str` Optional. The parent resource name if any to retrieve list from.
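Applied to index endpoints, the `filter` and `order_by` parameters might be combined as follows. This is a sketch under assumptions: `endpoints_named` is a hypothetical helper, `endpoint_cls` is expected to be `aiplatform.MatchingEngineIndexEndpoint`, and the `display_name="..."` filter syntax is borrowed from the example usage above.

```python
def endpoints_named(endpoint_cls, display_name: str):
    """List endpoints with the given display name, newest first."""
    return endpoint_cls.list(
        filter=f'display_name="{display_name}"',
        order_by="create_time desc",
    )
```

Taking the class as an argument keeps the helper importable without the SDK; a caller would typically pass `aiplatform.MatchingEngineIndexEndpoint`.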

match

match(
    deployed_index_id: str,
    queries: typing.Optional[typing.List[typing.List[float]]] = None,
    num_neighbors: int = 1,
    filter: typing.Optional[
        typing.List[
            google.cloud.aiplatform.matching_engine.matching_engine_index_endpoint.Namespace
        ]
    ] = None,
    per_crowding_attribute_num_neighbors: typing.Optional[int] = None,
    approx_num_neighbors: typing.Optional[int] = None,
    fraction_leaf_nodes_to_search_override: typing.Optional[float] = None,
    low_level_batch_size: int = 0,
    numeric_filter: typing.Optional[
        typing.List[
            google.cloud.aiplatform.matching_engine.matching_engine_index_endpoint.NumericNamespace
        ]
    ] = None,
) -> typing.List[
    typing.List[
        google.cloud.aiplatform.matching_engine.matching_engine_index_endpoint.MatchNeighbor
    ]
]

Retrieves nearest neighbors for the given embedding queries on the specified deployed index for private endpoint only.

Parameters
Name	Description
`deployed_index_id`	`str` Required. The ID of the DeployedIndex to match the queries against.
`queries`	`List[List[float]]` Optional. A list of queries. Each query is a list of floats, representing a single embedding.
`num_neighbors`	`int` Optional. The number of nearest neighbors to be retrieved from the database for each query. Defaults to 1.
`filter`	`List[Namespace]` Optional. A list of Namespaces for filtering the matching results. For example, [Namespace("color", ["red"], []), Namespace("shape", [], ["squared"])] will match datapoints that satisfy "red color" but not include datapoints with "squared shape". Please refer to https://cloud.google.com/vertex-ai/docs/matching-engine/filtering#json for more detail.
`per_crowding_attribute_num_neighbors`	`int` Optional. Crowding is a constraint on a neighbor list produced by nearest neighbor search requiring that no more than some value k' of the k neighbors returned have the same value of crowding_attribute. It's used for improving result diversity. This field is the maximum number of matches with the same crowding tag.
`approx_num_neighbors`	`int` The number of neighbors to find via approximate search before exact reordering is performed. If not set, the default value from scam config is used; if set, this value must be > 0.
`fraction_leaf_nodes_to_search_override`	`float` Optional. The fraction of the number of leaves to search. Setting it at query time allows the user to tune search performance. Increasing this value increases both search accuracy and latency. The value should be between 0.0 and 1.0.
`low_level_batch_size`	`int` Optional. Selects the optimal batch size to use for low-level batching. Queries within each low level batch are executed sequentially while low level batches are executed in parallel. This field is optional, defaults to 0 if not set. A non-positive number disables low level batching, i.e. all queries are executed sequentially.
`numeric_filter`	`List[NumericNamespace]` Optional. A list of NumericNamespaces for filtering the matching results. For example: [NumericNamespace(name="cost", value_int=5, op="GREATER")] will match datapoints whose cost is greater than 5.
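The crowding constraint described for `per_crowding_attribute_num_neighbors` can be illustrated with a small client-side model. This is only a conceptual sketch of the constraint, not how the service implements it; `apply_crowding` and its input format are hypothetical.

```python
from collections import Counter
from typing import List, Tuple


def apply_crowding(ranked: List[Tuple[str, str]], k: int, per_tag_limit: int) -> List[str]:
    """Keep at most `k` IDs, with no more than `per_tag_limit` per crowding tag.

    `ranked` holds (datapoint_id, crowding_tag) pairs, nearest first.
    """
    seen = Counter()
    kept: List[str] = []
    for datapoint_id, tag in ranked:
        if len(kept) == k:
            break
        if seen[tag] >= per_tag_limit:
            continue  # this tag has already used up its quota
        seen[tag] += 1
        kept.append(datapoint_id)
    return kept
```

With `ranked = [("a", "x"), ("b", "x"), ("c", "x"), ("d", "y"), ("e", "z")]`, asking for `k=3` and a per-tag limit of 2 skips `"c"` and keeps `"a"`, `"b"` and `"d"`, which is the diversity effect the parameter is meant to provide.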

mutate_deployed_index

mutate_deployed_index(
    deployed_index_id: str,
    min_replica_count: int = 1,
    max_replica_count: int = 1,
    request_metadata: typing.Optional[typing.Sequence[typing.Tuple[str, str]]] = (),
    mutate_request_timeout: typing.Optional[float] = None,
)

Updates an existing deployed index under this endpoint resource.

Parameters
Name	Description
`deployed_index_id`	`str` Required. The user specified ID of the DeployedIndex. The ID can be up to 128 characters long and must start with a letter and only contain letters, numbers, and underscores. The ID must be unique within the project it is created in.
`min_replica_count`	`int` Optional. The minimum number of machine replicas this deployed model will be always deployed on. If traffic against it increases, it may dynamically be deployed onto more replicas, and as traffic decreases, some of these extra replicas may be freed.
`max_replica_count`	`int` Optional. The maximum number of replicas this deployed model may be deployed on when the traffic against it increases. If requested value is too large, the deployment will error, but if deployment succeeds then the ability to scale the model to that many replicas is guaranteed (barring service outages). If traffic against the deployed model increases beyond what its replicas at maximum may handle, a portion of the traffic will be dropped. If this value is not provided, the larger value of min_replica_count or 1 will be used. If value provided is smaller than min_replica_count, it will automatically be increased to be min_replica_count.
`request_metadata`	`Sequence[Tuple[str, str]]` Optional. Strings which should be sent along with the request as metadata.
`index_id`	`str` Required. The ID of the MatchingEngineIndex associated with the DeployedIndex.
`mutate_request_timeout`	`float` Optional. The timeout for the request in seconds.

read_index_datapoints

read_index_datapoints(
    *, deployed_index_id: str, ids: typing.List[str] = []
) -> typing.List[google.cloud.aiplatform_v1beta1.types.index.IndexDatapoint]

Reads the datapoints/vectors of the given IDs on the specified deployed index which is deployed to public or private endpoint.

Example Usage:
    my_index_endpoint = aiplatform.MatchingEngineIndexEndpoint(
        index_endpoint_name='projects/123/locations/us-central1/indexEndpoints/my_index_endpoint_id'
    )
    my_index_endpoint.read_index_datapoints(deployed_index_id="public_test1", ids=["606431", "896688"])

Parameters
Name	Description
`deployed_index_id`	`str` Required. The ID of the DeployedIndex to match the queries against.
`ids`	`List[str]` Required. IDs of the datapoints to be searched for.
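The returned `IndexDatapoint` objects can be turned into a lookup table keyed by datapoint ID. A minimal sketch, assuming `endpoint` is a `MatchingEngineIndexEndpoint` and that each datapoint exposes the `datapoint_id` and `feature_vector` fields of the `IndexDatapoint` type; `vectors_by_id` is a hypothetical helper.

```python
from typing import Dict, List, Sequence


def vectors_by_id(endpoint, deployed_index_id: str, ids: Sequence[str]) -> Dict[str, List[float]]:
    """Map each found datapoint ID to its embedding vector."""
    datapoints = endpoint.read_index_datapoints(
        deployed_index_id=deployed_index_id,
        ids=list(ids),
    )
    # Copy the repeated field into a plain list for easier downstream use.
    return {dp.datapoint_id: list(dp.feature_vector) for dp in datapoints}
```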

to_dict

to_dict() -> typing.Dict[str, typing.Any]

Returns the resource proto as a dictionary.

undeploy_all

undeploy_all(
    sync: bool = True,
) -> (
    google.cloud.aiplatform.matching_engine.matching_engine_index_endpoint.MatchingEngineIndexEndpoint
)

Undeploys every index deployed to this MatchingEngineIndexEndpoint.

Parameter
Name	Description
`sync`	`bool` Whether to execute this method synchronously. If False, this method will be executed in concurrent Future and any downstream object will be immediately returned and synced when the Future has completed.

undeploy_index

undeploy_index(
    deployed_index_id: str,
    request_metadata: typing.Optional[typing.Sequence[typing.Tuple[str, str]]] = (),
    undeploy_request_timeout: typing.Optional[float] = None,
) -> (
    google.cloud.aiplatform.matching_engine.matching_engine_index_endpoint.MatchingEngineIndexEndpoint
)

Undeploys a deployed index from this endpoint resource.

Parameters
Name	Description
`deployed_index_id`	`str` Required. The ID of the DeployedIndex to be undeployed from the IndexEndpoint.
`request_metadata`	`Sequence[Tuple[str, str]]` Optional. Strings which should be sent along with the request as metadata.
`undeploy_request_timeout`	`float` Optional. The timeout for the request in seconds.
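To undeploy only some indexes rather than all of them, `undeploy_index` can be combined with the `deployed_indexes` property. A rough sketch, assuming `endpoint` is a `MatchingEngineIndexEndpoint` and that each entry of `deployed_indexes` has an `id` attribute; `undeploy_matching` and the prefix convention are hypothetical.

```python
from typing import List


def undeploy_matching(endpoint, prefix: str) -> List[str]:
    """Undeploy every deployed index whose ID starts with `prefix`."""
    # Collect the IDs first so the list is not affected by the undeploy calls.
    ids = [d.id for d in endpoint.deployed_indexes if d.id.startswith(prefix)]
    for deployed_index_id in ids:
        endpoint.undeploy_index(deployed_index_id=deployed_index_id)
    return ids
```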

update

update(
    display_name: str,
    description: typing.Optional[str] = None,
    labels: typing.Optional[typing.Dict[str, str]] = None,
    request_metadata: typing.Optional[typing.Sequence[typing.Tuple[str, str]]] = (),
    update_request_timeout: typing.Optional[float] = None,
) -> (
    google.cloud.aiplatform.matching_engine.matching_engine_index_endpoint.MatchingEngineIndexEndpoint
)

Updates an existing index endpoint resource.

Parameters
Name	Description
`display_name`	`str` Required. The display name of the IndexEndpoint. The name can be up to 128 characters long and can consist of any UTF-8 characters.
`description`	`str` Optional. The description of the IndexEndpoint.
`labels`	`Dict[str, str]` Optional. The labels with user-defined metadata to organize your IndexEndpoint. Label keys and values can be no longer than 64 characters (Unicode codepoints), can only contain lowercase letters, numeric characters, underscores and dashes. International characters are allowed. See https://goo.gl/xmQnxf for more information on and examples of labels. No more than 64 user labels can be associated with one IndexEndpoint (System labels are excluded). System reserved label keys are prefixed with "aiplatform.googleapis.com/" and are immutable.
`request_metadata`	`Sequence[Tuple[str, str]]` Optional. Strings which should be sent along with the request as metadata.
`update_request_timeout`	`float` Optional. The timeout for the request in seconds.

wait

wait()

Helper method that blocks until all futures are complete.
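Methods that accept `sync=False` return immediately, and `wait` can then be used to block until the work has finished. A minimal sketch, assuming `endpoint` is a `MatchingEngineIndexEndpoint`; `undeploy_all_and_wait` is a hypothetical helper name.

```python
def undeploy_all_and_wait(endpoint):
    """Start undeploying every index without blocking, then wait for completion."""
    endpoint.undeploy_all(sync=False)  # returns right away; work runs in a Future
    # Other independent work could be done here before blocking.
    endpoint.wait()  # blocks until all pending futures are complete
    return endpoint
```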
