Storage-optimized Vector Search

By default, Vector Search indexes are optimized for performance. For a more cost-effective solution, you can configure indexes to be optimized for storage instead. This consolidates index data onto fewer shards in exchange for slightly slower queries. This is ideal in scenarios where reducing operational costs is more critical than achieving the lowest possible latency.

When to use a storage-optimized index

Consider storage-optimized indexes if you have any of the following:

  • A very large dataset: You must index very large numbers of vectors, and the cost of hosting a large number of performance-optimized shards is prohibitive.

  • A low-QPS workload: In low-query-volume applications, the cost savings from using fewer shards can be significant.

  • Flexible latency requirements: Your application can tolerate a minor increase in query latency, which is the time it takes to get a search result.

Performance trade-offs

Compared to the default performance-optimized index, a storage-optimized index has the following characteristics:

  • Increased query latency: Queries have slightly higher latency at a given recall level.

  • Reduced hosting cost: Index data is consolidated onto fewer shards, which lowers the cost of serving the index.

How to configure a storage-optimized index

To create a storage-optimized index, set the shardSize parameter to SHARD_SIZE_SO_DYNAMIC in your index configuration.

Example: Creating a storage-optimized index

The following example shows the metadata for creating a new streaming index that is storage-optimized.

{
  "displayName": "my-storage-optimized-index",
  "description": "An index configured to prioritize storage over performance.",
  "metadata": {
    "contentsDeltaUri": "gs://your-bucket/source-data/",
    "config": {
      "dimensions": 100,
      "approximateNeighborsCount": 150,
      "distanceMeasureType": "DOT_PRODUCT_DISTANCE",
      "shardSize": "SHARD_SIZE_SO_DYNAMIC"
    }
  },
  "indexUpdateMethod": "STREAM_UPDATE"
}

In the example, shardSize is set to SHARD_SIZE_SO_DYNAMIC, which instructs Vector Search to build a denser index. This allows each shard to hold significantly more data points, reducing the total number of shards needed for your dataset. Configure the other fields, such as dimensions and distanceMeasureType, according to your dataset and use case.
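As a sketch, the metadata above can also be assembled programmatically before you submit it to the index-creation endpoint. The helper below is hypothetical (its name and parameters are not part of any official client library); it simply builds the request body shown in the example, with SHARD_SIZE_SO_DYNAMIC as the key setting:

```python
import json

# Hypothetical helper: builds the index-creation request body shown above.
# Field names mirror the example metadata; adjust values to your own dataset.
def build_storage_optimized_index(display_name, contents_uri, dimensions,
                                  approximate_neighbors_count=150,
                                  distance_measure="DOT_PRODUCT_DISTANCE"):
    return {
        "displayName": display_name,
        "metadata": {
            "contentsDeltaUri": contents_uri,
            "config": {
                "dimensions": dimensions,
                "approximateNeighborsCount": approximate_neighbors_count,
                "distanceMeasureType": distance_measure,
                # The key setting: request a storage-optimized (denser) index.
                "shardSize": "SHARD_SIZE_SO_DYNAMIC",
            },
        },
        "indexUpdateMethod": "STREAM_UPDATE",
    }

body = build_storage_optimized_index(
    "my-storage-optimized-index", "gs://your-bucket/source-data/", 100)
print(json.dumps(body, indent=2))
```

You can then pass the resulting JSON to whatever tool you use to create indexes, such as a REST call or a client library.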

What's next?